[ https://issues.apache.org/jira/browse/KUDU-3438?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Xixu Wang updated KUDU-3438:
----------------------------
    Description:
The TabletCopyClientAbortTest unit test may crash with a core dump. See the stack trace below.

{code:java}
/root/kudu/src/kudu/tserver/tablet_server-test-base.cc:130: Failure
Failed
Bad status: IO error: Couldn't create tablet metadata: Failed to create TabletMetadata: All healthy data directories are full (error 28)
W20230123 18:02:20.993072 869956 reactor.cc:684] Failed to create an outbound connection to 255.255.255.255:1 because connect() failed: Network error: connect(2) error: Network is unreachable (error 101)
/root/kudu/src/kudu/tserver/tablet_copy-test-base.h:49: Failure
Expected: StartTabletServer(kNumDataDirs) doesn't generate new fatal failures in the current thread.
Actual: it does.
/root/kudu/src/kudu/tserver/tablet_copy_client-test.cc:112: Failure
Expected: TabletCopyTest::SetUp() doesn't generate new fatal failures in the current thread.
Actual: it does.
W20230123 18:02:20.993108 870018 heartbeater.cc:399] Failed 3 heartbeats in a row: no longer allowing fast heartbeat attempts.
*** Aborted at 1674468140 (unix time) try "date -d @1674468140" if you are using GNU date ***
PC: @ 0x0 (unknown)
*** SIGSEGV (@0x0) received by PID 868247 (TID 0x7f2d76bb8a00) from PID 0; stack trace: ***
    @ 0x7f2d7964e9f6 google::(anonymous namespace)::FailureSignalHandler()
    @ 0x7f2d7d4c6630 (unknown)
    @ 0x4a32d0 kudu::tserver::TabletCopyClientTest::StartCopy()
    @ 0x4a51c8 kudu::tserver::TabletCopyClientAbortTest::SetUp()
    @ 0x7f2d81704bfe testing::internal::HandleExceptionsInMethodIfSupported<>()
    @ 0x7f2d816f9566 testing::Test::Run()
    @ 0x7f2d816f9795 testing::TestInfo::Run()
    @ 0x7f2d816f9cdf testing::TestSuite::Run()
    @ 0x7f2d816fa29f testing::internal::UnitTestImpl::RunAllTests()
    @ 0x7f2d8170513e testing::internal::HandleExceptionsInMethodIfSupported<>()
    @ 0x7f2d816f983d testing::UnitTest::Run()
    @ 0x7f2d81cc7f76 RUN_ALL_TESTS()
    @ 0x7f2d81cc72e6 main
    @ 0x7f2d77f17555 __libc_start_main
    @ 0x48e879 (unknown)
Segmentation fault (core dumped)
{code}

The cause is that TabletCopyClientTest::SetUp(), as run for TabletCopyClientAbortTest, may fail, for example because the disk is full. In that case TabletCopyClient is not initialized, so using it in StartCopy() causes a core dump.

!image-2023-01-30-10-10-40-439.png!

  was:
The TabletCopyClientAbortTest unit test may crash with a core dump. See the stack trace below.
{code:java}
// code placeholder
{code}
/root/kudu/src/kudu/tserver/tablet_server-test-base.cc:130: Failure
Failed
Bad status: IO error: Couldn't create tablet metadata: Failed to create TabletMetadata: All healthy data directories are full (error 28)
W20230123 18:02:20.993072 869956 reactor.cc:684] Failed to create an outbound connection to 255.255.255.255:1 because connect() failed: Network error: connect(2) error: Network is unreachable (error 101)
/root/kudu/src/kudu/tserver/tablet_copy-test-base.h:49: Failure
Expected: StartTabletServer(kNumDataDirs) doesn't generate new fatal failures in the current thread.
Actual: it does.
/root/kudu/src/kudu/tserver/tablet_copy_client-test.cc:112: Failure
Expected: TabletCopyTest::SetUp() doesn't generate new fatal failures in the current thread.
Actual: it does.
W20230123 18:02:20.993108 870018 heartbeater.cc:399] Failed 3 heartbeats in a row: no longer allowing fast heartbeat attempts.
*** Aborted at 1674468140 (unix time) try "date -d @1674468140" if you are using GNU date ***
PC: @ 0x0 (unknown)
*** SIGSEGV (@0x0) received by PID 868247 (TID 0x7f2d76bb8a00) from PID 0; stack trace: ***
    @ 0x7f2d7964e9f6 google::(anonymous namespace)::FailureSignalHandler()
    @ 0x7f2d7d4c6630 (unknown)
    @ 0x4a32d0 kudu::tserver::TabletCopyClientTest::StartCopy()
    @ 0x4a51c8 kudu::tserver::TabletCopyClientAbortTest::SetUp()
    @ 0x7f2d81704bfe testing::internal::HandleExceptionsInMethodIfSupported<>()
    @ 0x7f2d816f9566 testing::Test::Run()
    @ 0x7f2d816f9795 testing::TestInfo::Run()
    @ 0x7f2d816f9cdf testing::TestSuite::Run()
    @ 0x7f2d816fa29f testing::internal::UnitTestImpl::RunAllTests()
    @ 0x7f2d8170513e testing::internal::HandleExceptionsInMethodIfSupported<>()
    @ 0x7f2d816f983d testing::UnitTest::Run()
    @ 0x7f2d81cc7f76 RUN_ALL_TESTS()
    @ 0x7f2d81cc72e6 main
    @ 0x7f2d77f17555 __libc_start_main
    @ 0x48e879 (unknown)
Segmentation fault (core dumped)


> The unit test of TabletCopyClientAbortTest maybe core
> -----------------------------------------------------
>
>                 Key: KUDU-3438
>                 URL: https://issues.apache.org/jira/browse/KUDU-3438
>             Project: Kudu
>          Issue Type: Bug
>            Reporter: Xixu Wang
>            Priority: Major
>         Attachments: image-2023-01-30-10-10-40-439.png
>
>
> The TabletCopyClientAbortTest unit test may crash with a core dump. See the
> stack trace below.
> {code:java}
> /root/kudu/src/kudu/tserver/tablet_server-test-base.cc:130: Failure
> Failed
> Bad status: IO error: Couldn't create tablet metadata: Failed to create TabletMetadata: All healthy data directories are full (error 28)
> W20230123 18:02:20.993072 869956 reactor.cc:684] Failed to create an outbound connection to 255.255.255.255:1 because connect() failed: Network error: connect(2) error: Network is unreachable (error 101)
> /root/kudu/src/kudu/tserver/tablet_copy-test-base.h:49: Failure
> Expected: StartTabletServer(kNumDataDirs) doesn't generate new fatal failures in the current thread.
> Actual: it does.
> /root/kudu/src/kudu/tserver/tablet_copy_client-test.cc:112: Failure
> Expected: TabletCopyTest::SetUp() doesn't generate new fatal failures in the current thread.
> Actual: it does.
> W20230123 18:02:20.993108 870018 heartbeater.cc:399] Failed 3 heartbeats in a row: no longer allowing fast heartbeat attempts.
> *** Aborted at 1674468140 (unix time) try "date -d @1674468140" if you are using GNU date ***
> PC: @ 0x0 (unknown)
> *** SIGSEGV (@0x0) received by PID 868247 (TID 0x7f2d76bb8a00) from PID 0; stack trace: ***
>     @ 0x7f2d7964e9f6 google::(anonymous namespace)::FailureSignalHandler()
>     @ 0x7f2d7d4c6630 (unknown)
>     @ 0x4a32d0 kudu::tserver::TabletCopyClientTest::StartCopy()
>     @ 0x4a51c8 kudu::tserver::TabletCopyClientAbortTest::SetUp()
>     @ 0x7f2d81704bfe testing::internal::HandleExceptionsInMethodIfSupported<>()
>     @ 0x7f2d816f9566 testing::Test::Run()
>     @ 0x7f2d816f9795 testing::TestInfo::Run()
>     @ 0x7f2d816f9cdf testing::TestSuite::Run()
>     @ 0x7f2d816fa29f testing::internal::UnitTestImpl::RunAllTests()
>     @ 0x7f2d8170513e testing::internal::HandleExceptionsInMethodIfSupported<>()
>     @ 0x7f2d816f983d testing::UnitTest::Run()
>     @ 0x7f2d81cc7f76 RUN_ALL_TESTS()
>     @ 0x7f2d81cc72e6 main
>     @ 0x7f2d77f17555 __libc_start_main
>     @ 0x48e879 (unknown)
> Segmentation fault (core dumped)
> {code}
>
>
> The cause is that TabletCopyClientTest::SetUp(), as run for
> TabletCopyClientAbortTest, may fail, for example because the disk is full.
> In that case TabletCopyClient is not initialized, so using it in StartCopy()
> causes a core dump.
> !image-2023-01-30-10-10-40-439.png!



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
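
The failure mode described in this issue (a fixture's setup fails, the client object is never created, and a later step dereferences it anyway) can be illustrated with a small stand-alone sketch. This is not the Kudu test code: FakeCopyClient, FakeCopyFixture, and SetUpAndStartCopy are hypothetical names invented for the illustration, and the "disk full" condition is simulated with a boolean flag. In a real GoogleTest fixture, a comparable guard would likely wrap the base SetUp() call in ASSERT_NO_FATAL_FAILURE or check HasFatalFailure() before continuing; which approach suits this test is an assumption, not something the report states.

```cpp
#include <memory>

// Hypothetical stand-in for the client object owned by the test fixture.
struct FakeCopyClient {
  bool started = false;
  void Start() { started = true; }
};

// Hypothetical fixture, loosely mirroring the shape described in the report.
class FakeCopyFixture {
 public:
  // Returns false when setup cannot complete (simulated full disk).
  // In that case client_ stays null, as in the reported failure.
  bool SetUp(bool disk_full) {
    if (disk_full) return false;
    client_ = std::make_unique<FakeCopyClient>();
    return true;
  }

  // Guarded variant: refuses to touch client_ if setup never created it,
  // instead of dereferencing a null pointer.
  bool StartCopy() {
    if (!client_) return false;
    client_->Start();
    return true;
  }

  bool client_started() const { return client_ && client_->started; }

 private:
  std::unique_ptr<FakeCopyClient> client_;
};

// Mirrors a derived SetUp(): stop as soon as the base setup fails,
// so StartCopy() is never reached with an uninitialized client.
bool SetUpAndStartCopy(FakeCopyFixture& fixture, bool disk_full) {
  if (!fixture.SetUp(disk_full)) return false;
  return fixture.StartCopy();
}
```

Calling SetUpAndStartCopy(fixture, true) returns false without touching the client, whereas an unguarded StartCopy() after a failed setup would dereference a null pointer, which matches the SIGSEGV at address 0x0 in the trace above.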