Hello Tidy Bot, Alexey Serbin, Kudu Jenkins, Adar Dembo, Hao Hao, I'd like you to reexamine a change. Please visit
http://gerrit.cloudera.org:8080/13549 to look at the new patch set (#6). Change subject: sentry: avoid authorizing every table in ListTables ...................................................................... sentry: avoid authorizing every table in ListTables Currently ListTables will call into Sentry for every table in Kudu's catalog, checking whether the user has metadata privileges on the table, and adding it to the ListTablesResponsePB if so. This is expensive, particularly when there are thousands of tables in Kudu. This patch updates the authorization behavior to check whether the user has METADATA privileges on the table's database for each table. If no such privilege exists for the database, each table within the database is checked. A benchmark is added to gauge the performance in various scenarios: Iterating through many databases: $ ./bin/sentry_authz_provider-test --gtest_filter=*ListTablesBench* --num_databases=300 --num_tables_per_db=1 --has-db-privileges=true sentry_authz_provider-test.cc:418] Time spent Listing tables: real 25.243s user 0.015s sys 0.001s Iterating through many databases and many tables. This is the worst case since listing a given table will require two lookups in Sentry -- one for the database and one for the table: $ ./bin/sentry_authz_provider-test --gtest_filter=*ListTablesBench* --num_databases=300 --num_tables_per_db=1 --has-db-privileges=false sentry_authz_provider-test.cc:418] Time spent Listing tables: real 47.543s user 0.040s sys 0.013s This above worst case is actually less performant than without this patch. A follow-up patch will make this more performant by caching privileges from requests at the database scope. Iterating through one database and no tables since the user has database-level privileges: $ ./bin/sentry_authz_provider-test --gtest_filter=*ListTablesBench* --num_databases=1 --num_tables_per_db=600 --has-db-privileges=true sentry_authz_provider-test.cc:418] Time spent Listing tables: real 0.271s user 0.000s sys 0.000s Iterating through one database and many tables: $ ./bin/sentry_authz_provider-test --gtest_filter=*ListTablesBench* --num_databases=1 --num_tables_per_db=600 --has-db-privileges=false sentry_authz_provider-test.cc:418] Time spent Listing tables: real 47.441s user 0.031s sys 0.013s Change-Id: I7c495c635fbd2a661709b6d7f04973b7864b5527 --- M src/kudu/master/sentry_authz_provider-test.cc M src/kudu/master/sentry_authz_provider.cc 2 files changed, 79 insertions(+), 2 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/49/13549/6 -- To view, visit http://gerrit.cloudera.org:8080/13549 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I7c495c635fbd2a661709b6d7f04973b7864b5527 Gerrit-Change-Number: 13549 Gerrit-PatchSet: 6 Gerrit-Owner: Andrew Wong <aw...@cloudera.com> Gerrit-Reviewer: Adar Dembo <a...@cloudera.com> Gerrit-Reviewer: Alexey Serbin <aser...@cloudera.com> Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com> Gerrit-Reviewer: Hao Hao <hao....@cloudera.com> Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Tidy Bot (241)