[ https://issues.apache.org/jira/browse/KUDU-3486?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17913134#comment-17913134 ]
ASF subversion and git services commented on KUDU-3486:
-------------------------------------------------------
Commit f6c3253e95ed9b44603a58f8cb9614c222040d3f in kudu's branch
refs/heads/branch-1.18.x from Alexey Serbin
[ https://gitbox.apache.org/repos/asf?p=kudu.git;h=f6c3253e9 ]
[tserver] disable KUDU-3486 behavior by default

This is a quick-and-dirty fix to mitigate KUDU-3638.

This patch does not try to properly address the issues that KUDU-3486
has introduced, apart from fixing the obvious bug of missing updates
of the Heartbeater::Thread::last_tombstoned_report_time_ field.

Also, with this patch, the functionality introduced with KUDU-3486
is now disabled by default. To re-enable it, customize the setting
for the --tserver_send_tombstoned_tablets_report_inteval_secs flag,
if needed.

Properly implementing the functionality that KUDU-3486 attempted to add
would be a much more involved patch because there are several items
to address from both the design and implementation standpoints.

Change-Id: I8e32aafab99c74f0ead3ba65aea58ce91d40297c
Reviewed-on: http://gerrit.cloudera.org:8080/22341
Reviewed-by: Abhishek Chennaka
Tested-by: Alexey Serbin
(cherry picked from commit 0ddaac556f7bc7aeb47db740300921d10eabd856)
Reviewed-on: http://gerrit.cloudera.org:8080/22344
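
The commit message above describes a periodic tombstoned-tablets report
that depends on a "last report time" field being updated, and that is now
off by default. A minimal sketch of that kind of timing check follows. It is
not Kudu's actual code: the TombstonedReportTimer class, the CountReports
helper, and the convention that a non-positive interval disables the report
are all assumptions made for illustration.

```cpp
#include <cassert>
#include <chrono>

// Hypothetical stand-in for a heartbeater's tombstoned-report timer.
// This is an illustration only, not Kudu's implementation.
class TombstonedReportTimer {
 public:
  using Seconds = std::chrono::seconds;

  // Assumption: a non-positive interval disables the periodic report.
  explicit TombstonedReportTimer(Seconds interval) : interval_(interval) {}

  // 'now' is the time elapsed since the timer was created.
  // Returns true when a report is due and records the time of the report,
  // so the next report waits for a full interval.
  bool ShouldReport(Seconds now) {
    if (interval_ <= Seconds::zero()) return false;          // disabled
    if (now - last_report_time_ < interval_) return false;   // too early
    // In this sketch, skipping this update would make every call after the
    // first interval return true.
    last_report_time_ = now;
    return true;
  }

 private:
  Seconds interval_;
  Seconds last_report_time_{0};
};

// Simulates 'heartbeats' heartbeats spaced 'period' apart and counts how
// many of them would carry a tombstoned-tablets report.
inline int CountReports(TombstonedReportTimer& timer, int heartbeats,
                        std::chrono::seconds period) {
  assert(heartbeats >= 0);
  int reports = 0;
  for (int i = 1; i <= heartbeats; ++i) {
    if (timer.ShouldReport(period * i)) ++reports;
  }
  return reports;
}
```

With a 10-second interval and one heartbeat per second, 30 heartbeats in
this sketch yield three reports; with an interval of zero, none are sent.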
> Tserver: Too many tombstone tablet may lead to high memory usage.
> -----------------------------------------------------------------
>
> Key: KUDU-3486
> URL: https://issues.apache.org/jira/browse/KUDU-3486
> Project: Kudu
> Issue Type: Bug
> Components: tserver
> Affects Versions: 1.14.0
> Reporter: Song Jiacheng
> Priority: Minor
> Fix For: 1.18.0, 1.17.1
>
> Attachments: image-2023-07-06-15-59-44-181.png
>
>
> There are two kinds of tablet replica deletion: tombstone and delete. A
> tombstoned replica might never be fully deleted, because delete-type
> deletion only happens when the tablet itself is deleted, and those
> requests are sent only to the voters, which do not include tombstoned
> replicas.
> Here is an example:
> Tablet T:
>   replica A
>   replica B
>   replica C
> After rebalance:
>   replica A
>   replica B
>   replica C (tombstoned)
>   replica D
> When tablet T is deleted, replicas A, B, and D are deleted, but C remains
> forever. As the picture below shows, the tablet was deleted at 3:00 am on
> 13 Jun, but the tombstoned replica still exists.
> !image-2023-07-06-15-59-44-181.png|width=568,height=261!
> The data of a tombstoned replica is deleted, but its metadata stays in
> memory; the largest part, SchemaPB, occupies a lot of memory.
> In some of our clusters, each tserver can hold 50k to 100k tombstoned
> replicas, which take about 10G of memory.
> Adding a vector to each tablet to record the tablet servers that used to
> hold a replica of it would take too many resources, so I think a periodic
> heartbeat might be a good way to solve the problem.
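
The replica example in the issue description can be modelled in a few lines.
This is only a sketch of the scenario as the reporter describes it, not
Kudu's data structures: the ReplicaState enum, the TabletReplicas map, and
the MoveReplica, DeleteTablet, and CountTombstoned helpers are hypothetical
names introduced here.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical replica states for this illustration.
enum class ReplicaState { kVoter, kTombstoned, kDeleted };

// Hypothetical model: server name -> state of its replica of one tablet.
using TabletReplicas = std::map<std::string, ReplicaState>;

// Rebalance step: the replica on 'from' is tombstoned and a new voter
// replica is created on 'to'.
inline void MoveReplica(TabletReplicas& replicas, const std::string& from,
                        const std::string& to) {
  assert(replicas.count(from) == 1);
  replicas[from] = ReplicaState::kTombstoned;
  replicas[to] = ReplicaState::kVoter;
}

// Tablet deletion: delete requests reach only the current voters, so a
// tombstoned replica is left untouched.
inline void DeleteTablet(TabletReplicas& replicas) {
  for (auto& entry : replicas) {
    if (entry.second == ReplicaState::kVoter) {
      entry.second = ReplicaState::kDeleted;
    }
  }
}

// Counts the replicas still in the tombstoned state.
inline int CountTombstoned(const TabletReplicas& replicas) {
  int count = 0;
  for (const auto& entry : replicas) {
    if (entry.second == ReplicaState::kTombstoned) ++count;
  }
  return count;
}
```

Starting from voters A, B, and C, moving the replica from C to D and then
deleting the tablet leaves A, B, and D deleted and C still tombstoned in
this model.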