[jira] [Commented] (KUDU-3333) Include Table Counts in kudu hms Dryrun

ASF subversion and git services (Jira) Sun, 20 Mar 2022 21:09:10 -0700


    [ 
https://issues.apache.org/jira/browse/KUDU-3333?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17509591#comment-17509591
 ]


ASF subversion and git services commented on KUDU-3333:
-------------------------------------------------------

Commit 9a53fec14b9aaa811732f2b7a87da36d893203da in kudu's branch 
refs/heads/master from Abhishek Chennaka
[ https://gitbox.apache.org/repos/asf?p=kudu.git;h=9a53fec ]

[tools] KUDU-3333 Include Table Counts in kudu hms Dryrun

If the user running the Kudu CLI tool kudu hms fix lacks
Ranger/Sentry permissions to access some tables, the tool treats
those tables as nonexistent in Kudu. As a result, tables can be
dropped from HMS in spite of being present in Kudu when the tool
is run with the --drop_orphan_hms_tables flag.
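
To make the failure mode concrete, the sketch below models orphan detection as a plain set difference. It is an illustration only: the function find_orphaned_hms_tables and the second table name are hypothetical and are not taken from the Kudu source.

```python
def find_orphaned_hms_tables(hms_tables, visible_kudu_tables):
    """Return HMS entries with no matching Kudu table visible to the caller.

    Hypothetical helper for illustration; not the Kudu tool's actual code.
    """
    return sorted(set(hms_tables) - set(visible_kudu_tables))


hms = ["default.my_first_table", "default.events"]

# Caller is authorized to see every Kudu table: nothing is orphaned.
print(find_orphaned_hms_tables(hms, hms))  # []

# Caller lacks Ranger/Sentry privileges, so the Kudu listing comes back empty
# and every HMS entry looks orphaned even though the tables still exist.
print(find_orphaned_hms_tables(hms, []))
```

With an empty Kudu listing, the difference equals the whole HMS list, which is why dropping orphans in that state removes every HMS record.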

This patch adds logging that reports the total table counts
from the HMS and Kudu master catalogs, and warns the user if
no tables are found in Kudu when the kudu hms fix command is
run.
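
The reporting behaviour described above could be sketched roughly as follows. This is a Python approximation under stated assumptions, not the patch itself (the real change is C++ in tool_action_hms.cc): the function name summarize_table_counts is hypothetical, and the sketch assumes the warning is emitted whenever the Kudu count is zero. In the sample runs quoted in this message the NOTE line appears only in the --dryrun invocation, so the exact condition in the real tool may differ.

```python
NOTE_ZERO_TABLES = (
    "NOTE: There are zero kudu tables listed. If the cluster indeed has "
    "kudu tables please re-run the command with right credentials."
)


def summarize_table_counts(kudu_count, hms_count):
    """Build report lines modelled on the sample log messages.

    Hypothetical sketch; assumes the warning depends only on kudu_count.
    """
    lines = []
    if kudu_count == 0:
        # An empty Kudu listing may mean an empty cluster or missing privileges.
        lines.append(NOTE_ZERO_TABLES)
    lines.append(
        f"Number of Kudu tables found in Kudu master catalog: {kudu_count}")
    lines.append(f"Number of Kudu tables found in HMS catalog: {hms_count}")
    return lines


for line in summarize_table_counts(0, 3):
    print(line)
```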

Sample runs of the tool before and after the change:
On an empty cluster, the tool prints no output without this
change. With the change, the output is:
$ ./kudu hms fix `hostname -f`
I0315 16:16:36.039008 351197 tool_action_hms.cc:867] Number of Kudu tables 
found in Kudu master catalog: 0
I0315 16:16:36.039080 351197 tool_action_hms.cc:868] Number of Kudu tables 
found in HMS catalog: 0
$ ./kudu hms fix --dryrun `hostname -f`
I0315 16:16:55.158463 351291 tool_action_hms.cc:642] NOTE: There are zero kudu 
tables listed. If the cluster indeed has kudu tables please re-run the command 
with right credentials.
I0315 16:16:55.158546 351291 tool_action_hms.cc:867] Number of Kudu tables 
found in Kudu master catalog: 0
I0315 16:16:55.158555 351291 tool_action_hms.cc:868] Number of Kudu tables 
found in HMS catalog: 0

In case of a non-empty cluster without the change:
$ kudu hms fix --dryrun `hostname -f` --ignore_other_clusters=false
I0315 16:57:55.329049 365038 tool_action_hms.cc:757] [dryrun] Refreshing HMS 
table metadata for Kudu table default.my_first_table 
[id=408e5696e51c462c86a6d9a84bb95583]
In case of a non-empty cluster after the change:
$ ./kudu hms fix --dryrun `hostname -f`
I0315 16:19:20.885208 352393 tool_action_hms.cc:822] [dryrun] Changing owner of 
default.my_first_table [id=408e5696e51c462c86a6d9a84bb95583] to admin in Kudu 
catalog.
I0315 16:19:20.885274 352393 tool_action_hms.cc:853] [dryrun] Refreshing HMS 
table metadata for Kudu table default.my_first_table 
[id=408e5696e51c462c86a6d9a84bb95583]
I0315 16:19:20.885285 352393 tool_action_hms.cc:867] Number of Kudu tables 
found in Kudu master catalog: 1
I0315 16:19:20.885325 352393 tool_action_hms.cc:868] Number of Kudu tables 
found in HMS catalog: 1
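
A wrapper script could read these two count lines before deciding whether to run the real fix. The sketch below is one possible way to do that, assuming the message text stays exactly as in the sample output above. The names parse_table_counts and looks_like_acl_problem are hypothetical, and the heuristic (zero tables in Kudu but a non-zero count in HMS) is an assumption of this example, not something the tool itself enforces.

```python
import re

# \s+ tolerates the line wrapping seen in this email as well as single-line logs.
KUDU_COUNT_RE = re.compile(
    r"Number of Kudu tables\s+found in Kudu master catalog:\s*(\d+)")
HMS_COUNT_RE = re.compile(
    r"Number of Kudu tables\s+found in HMS catalog:\s*(\d+)")


def parse_table_counts(output):
    """Return (kudu_count, hms_count); a count is None if its line is missing."""
    kudu = KUDU_COUNT_RE.search(output)
    hms = HMS_COUNT_RE.search(output)
    return (int(kudu.group(1)) if kudu else None,
            int(hms.group(1)) if hms else None)


def looks_like_acl_problem(output):
    """True when HMS lists Kudu tables but the Kudu master catalog shows none."""
    kudu_count, hms_count = parse_table_counts(output)
    return kudu_count == 0 and bool(hms_count)


sample = (
    "I0315 16:19:20.885285 352393 tool_action_hms.cc:867] Number of Kudu tables \n"
    "found in Kudu master catalog: 1\n"
    "I0315 16:19:20.885325 352393 tool_action_hms.cc:868] Number of Kudu tables \n"
    "found in HMS catalog: 1\n"
)
print(parse_table_counts(sample))  # (1, 1)
```

If looks_like_acl_problem returns True for the dry-run output, the safer course is to re-check credentials before running the command with orphan dropping enabled.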

Change-Id: Idf26141d2a3fd6cbb7249b3492fc6a50a0c0aa2d
Reviewed-on: http://gerrit.cloudera.org:8080/18280
Tested-by: Kudu Jenkins
Reviewed-by: Andrew Wong <aw...@cloudera.com>


> Include Table Counts in kudu hms Dryrun
> ---------------------------------------
>
>                 Key: KUDU-3333
>                 URL: https://issues.apache.org/jira/browse/KUDU-3333
>             Project: Kudu
>          Issue Type: Improvement
>          Components: hms
>            Reporter: David Mollitor
>            Assignee: Abhishek
>            Priority: Minor
>
> I have been bitten several times now by a particular scenario.
> Consider a scenario where the user running {{kudu hms fix}} can see the table 
> metadata in HMS but cannot see the Kudu tables {{kudu table list}} because of 
> some ACL discrepancy between the two services.
> In this case, the kudu tool believes that every record in HMS is orphaned and 
> (given {{--drop_orphan_hms_tables}}) drops *all* of the HMS records. This would 
> break workloads until the metadata could be restored.
> When running the {{kudu hms fix ... --dryrun=true}} command, it would be 
> really helpful to print (at the bottom of the table) the number of Kudu 
> tables and the number of HMS tables that were fetched. In this way, if I see 
> that either one of them is zero, I can assume there is an ACL issue.
> Thanks.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
