Hello Tidy Bot, Alexey Serbin, Kudu Jenkins, Hao Hao, I'd like you to reexamine a change. Please visit
http://gerrit.cloudera.org:8080/13549 to look at the new patch set (#3). Change subject: sentry: avoid authorizing every table in ListTables ...................................................................... sentry: avoid authorizing every table in ListTables Currently ListTables will call into Sentry for every table in Kudu's catalog, checking whether the user has metadata privileges on the table, and adding it to the ListTablesResponsePB if so. This is expensive, particularly when there are thousands of tables in Kudu. This patch updates the authorization behavior to check whether the user has METADATA privileges on the table's database for each table. If no such privilege exists for the database, each table within the database is checked. A benchmark is added to gauge the performance in various scenarios: Iterating through many databases: $ ./bin/sentry_authz_provider-test --gtest_filter=*ListTablesBench* --num_databases=300 --num_tables_per_db=1 --has-db-privileges=true sentry_authz_provider-test.cc:420] Time spent Listing tables: real 21.184s user 0.023s sys 0.003s Iterating through many databases and many tables. This is the worst case since listing a given table will require two lookups in Sentry -- one for the database and one for the table: $ ./bin/sentry_authz_provider-test --gtest_filter=*ListTablesBench* --num_databases=300 --num_tables_per_db=1 --has-db-privileges=false sentry_authz_provider-test.cc:420] Time spent Listing tables: real 44.448s user 0.037s sys 0.018s This above worst case is actually less performant than without this patch. A follow-up patch will make this more performant by caching privileges from requests at the database scope. Iterating through one database and no tables since the user has database-level privileges: $ ./bin/sentry_authz_provider-test --gtest_filter=*ListTablesBench* --num_databases=1 --num_tables_per_db=600 --has-db-privileges=true sentry_authz_provider-test.cc:420] Time spent Listing tables: real 0.313s user 0.000s sys 0.000s Iterating through one database and many tables: $ ./bin/sentry_authz_provider-test --gtest_filter=*ListTablesBench* --num_databases=1 --num_tables_per_db=600 --has-db-privileges=false sentry_authz_provider-test.cc:420] Time spent Listing tables: real 43.180s user 0.031s sys 0.011s Change-Id: I7c495c635fbd2a661709b6d7f04973b7864b5527 --- M src/kudu/master/CMakeLists.txt M src/kudu/master/authz_provider.h M src/kudu/master/catalog_manager.cc M src/kudu/master/catalog_manager.h A src/kudu/master/default_authz_provider.cc M src/kudu/master/default_authz_provider.h M src/kudu/master/sentry_authz_provider-test.cc M src/kudu/master/sentry_authz_provider.cc M src/kudu/master/sentry_authz_provider.h 9 files changed, 237 insertions(+), 32 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/49/13549/3 -- To view, visit http://gerrit.cloudera.org:8080/13549 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I7c495c635fbd2a661709b6d7f04973b7864b5527 Gerrit-Change-Number: 13549 Gerrit-PatchSet: 3 Gerrit-Owner: Andrew Wong <aw...@cloudera.com> Gerrit-Reviewer: Alexey Serbin <aser...@cloudera.com> Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com> Gerrit-Reviewer: Hao Hao <hao....@cloudera.com> Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Tidy Bot (241)