[
https://issues.apache.org/jira/browse/KUDU-3585?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Attila Bukor reassigned KUDU-3585:
----------------------------------
Assignee: Alexey Serbin
> ClientTest.ClearCacheAndConcurrentWorkload fails from time to time in TSAN
> builds
> ---------------------------------------------------------------------------------
>
> Key: KUDU-3585
> URL: https://issues.apache.org/jira/browse/KUDU-3585
> Project: Kudu
> Issue Type: Sub-task
> Components: client, test
> Affects Versions: 1.14.0, 1.15.0, 1.16.0, 1.17.0
> Reporter: Alexey Serbin
> Assignee: Alexey Serbin
> Priority: Major
> Fix For: 1.18.0
>
> Attachments: client-test.5.txt.xz
>
>
> The scenario sometimes fails in TSAN builds with output like that cited below.
> The root cause appears to be RPC queue overflows at kudu-master and
> kudu-tserver. Both spend much more time on regular requests when built with
> TSAN instrumentation, and resetting the client's meta-cache too often
> induces a lot of GetTableLocations requests; serving them consumes a lot of
> CPU and keeps many threads busy. Since the scenario uses an internal
> mini-cluster (i.e. all masters and tablet servers are part of a single
> process), that affects kudu-tserver RPC worker threads as well, so many
> requests accumulate in the RPC queues.
> {noformat}
> src/kudu/client/client-test.cc:408: Failure
> Expected equality of these values:
>   0
>   server->server()->rpc_server()->
>   service_pool("kudu.tserver.TabletServerService")->
>   RpcsQueueOverflowMetric()->value()
>     Which is: 1
> src/kudu/client/client-test.cc:584: Failure
> Expected: CheckNoRpcOverflow() doesn't generate new fatal failures in the
> current thread.
> Actual: it does.
>
> src/kudu/client/client-test.cc:2466: Failure
> Expected: DeleteTestRows(client_table_.get(), kLowIdx, kHighIdx) doesn't
> generate new fatal failures in the current thread.
> Actual: it does.
> {noformat}
--
This message was sent by Atlassian Jira
(v8.20.10#820010)