Ashwani Raina has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/23844 )

Change subject: [tools] Add --preserve_table_ids flag to unsafe_rebuild
......................................................................


Patch Set 5:

(3 comments)

Overall looks ok to me.
Just a couple of comments.

http://gerrit.cloudera.org:8080/#/c/23844/5/src/kudu/tools/kudu-admin-test.cc
File src/kudu/tools/kudu-admin-test.cc:

http://gerrit.cloudera.org:8080/#/c/23844/5/src/kudu/tools/kudu-admin-test.cc@3868
PS5, Line 3868: FLAGS_num_replicas = 1;
Here and elsewhere: Any reason replication factor is 1? Can't it be default 
value i.e. 3?


http://gerrit.cloudera.org:8080/#/c/23844/5/src/kudu/tools/kudu-admin-test.cc@4140
PS5, Line 4140:   // 1. The status is RemoteError/NotAuthorized directly, or
              :   // 2. The status is TimedOut but the error message contains 
"Not authorized"
Is Timeout status for eventual failure of scan and remote error for 
intermediate retries?
If that is the case, I would think NotAuthorized should non-retriable error.
If that is not the case, could you please explain which status belongs to which 
request?


http://gerrit.cloudera.org:8080/#/c/23844/5/src/kudu/tools/master_rebuilder.cc
File src/kudu/tools/master_rebuilder.cc:

http://gerrit.cloudera.org:8080/#/c/23844/5/src/kudu/tools/master_rebuilder.cc@64
PS5, Line 64: It is also useful for maintaining compatibility with external 
systems "
            :             "(e.g., Impala, HMS) that reference tables by their 
original IDs.
IIRC, if master rebuild happens and a new table ID is generated for a table and 
if there is a reference to the old table ID (I believe it is stored in HMS) in 
impala, the next steps would be to invalidate the metadata on impala and make 
it fetch the latest ID from kudu cluster. Invalidation would ensure that impala 
don't work with stale table IDs.

A few minor suggestions:
1. I think it would be worth exploring the impact of this change on impala. I 
am not sure if there is any but it is good to know with certainty. If there is, 
document the same in the form of a ticket or todo.
2. Since this flag is applicable only to master rebuild scenario, it is 
important to note that here in the description.
3. Also, make sure this change has no impact on other operations that could 
potentially change the table IDs. Maybe add a unit test case for such 
operations (if you think it is required).
4. If not done already, please run a manual test with impala. The idea is to 
verify that impala makes use of cached old table ID to query the table (after a 
master rebuild) without invalidating and re-fetching the table metadata.



--
To view, visit http://gerrit.cloudera.org:8080/23844
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6ae4353564922312d646f0323271d804e32e3b0d
Gerrit-Change-Number: 23844
Gerrit-PatchSet: 5
Gerrit-Owner: Yan-Daojiang <[email protected]>
Gerrit-Reviewer: Alexey Serbin <[email protected]>
Gerrit-Reviewer: Ashwani Raina <[email protected]>
Gerrit-Reviewer: Gabriella Lotz <[email protected]>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Yan-Daojiang <[email protected]>
Gerrit-Comment-Date: Mon, 19 Jan 2026 06:22:12 +0000
Gerrit-HasComments: Yes

Reply via email to