(kudu) branch master updated: [Tool] Return not OK status when copying tablets failed

2024-02-29 Thread alexey
This is an automated email from the ASF dual-hosted git repository.

alexey pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/kudu.git


The following commit(s) were added to refs/heads/master by this push:
 new d7fa4f9e6 [Tool] Return not OK status when copying tablets failed
d7fa4f9e6 is described below

commit d7fa4f9e62c4e9d87246656d1aea1b450cbf90fd
Author: xinghuayu007 <1450306...@qq.com>
AuthorDate: Thu Feb 29 15:12:14 2024 +0800

[Tool] Return not OK status when copying tablets failed

Currently, the 'local_replica copy_from_remote' command does not
return a non-OK status when copying some tablets fails. Therefore,
it is not possible to use the return status to tell whether the
command succeeded or failed: only the log shows useful messages.

This patch fixes the problem: the CLI tool now exits with a non-OK
status code when some tablets fail to copy.

Change-Id: Ic957cbc379645e0607c1c2a3bc568e20afc126b2
Reviewed-on: http://gerrit.cloudera.org:8080/21089
Tested-by: Alexey Serbin 
Reviewed-by: Alexey Serbin 
---
 src/kudu/tools/kudu-tool-test.cc            | 27 +++
 src/kudu/tools/tool_action_local_replica.cc | 10 +-
 2 files changed, 36 insertions(+), 1 deletion(-)

diff --git a/src/kudu/tools/kudu-tool-test.cc b/src/kudu/tools/kudu-tool-test.cc
index cb15e927c..f5fdc984c 100644
--- a/src/kudu/tools/kudu-tool-test.cc
+++ b/src/kudu/tools/kudu-tool-test.cc
@@ -9539,6 +9539,33 @@ TEST_F(ToolTest, TestLocalReplicaCopyRemoteWithSpeedLimit) {
   }
 }
 
+TEST_F(ToolTest, TestLocalReplicaCopyRemoteCanReturnError) {
+  SKIP_IF_SLOW_NOT_ALLOWED();
+  InternalMiniClusterOptions opts;
+  opts.num_tablet_servers = 2;
+  NO_FATALS(StartMiniCluster(std::move(opts)));
+  NO_FATALS(CreateTableWithFlushedData("table1", mini_cluster_.get(), 3, 1));
+  const auto source_tserver_rpc_addr = mini_cluster_->mini_tablet_server(0)
+                                                    ->bound_rpc_addr().ToString();
+  const auto wal_dir = mini_cluster_->mini_tablet_server(1)->options()->fs_opts.wal_root;
+  const auto data_dirs = JoinStrings(mini_cluster_->mini_tablet_server(1)
+                                                  ->options()->fs_opts.data_roots, ",");
+  NO_FATALS(mini_cluster_->mini_tablet_server(1)->Shutdown());
+
+  // An attempt to copy a non-existent tablet fails, and the return value is not OK.
+  string stderr;
+  Status s = RunActionStderrString(
+      Substitute("local_replica copy_from_remote $0 $1 "
+                 "-fs_data_dirs=$2 -fs_wal_dir=$3 ",
+                 "non-existent-tablet-ids-str",
+                 source_tserver_rpc_addr,
+                 data_dirs,
+                 wal_dir), &stderr);
+  ASSERT_TRUE(s.IsRuntimeError());
+  ASSERT_STR_CONTAINS(stderr,
+                      "some tablets failed to copy: check error messages for details");
+}
+
 
 class DownloadSuperblockInBatchTest :
 public ToolTest,
diff --git a/src/kudu/tools/tool_action_local_replica.cc b/src/kudu/tools/tool_action_local_replica.cc
index 3ca93755d..012f72850 100644
--- a/src/kudu/tools/tool_action_local_replica.cc
+++ b/src/kudu/tools/tool_action_local_replica.cc
@@ -16,6 +16,7 @@
 // under the License.
 
 #include 
+#include <atomic>
 #include 
 #include 
 #include 
@@ -339,7 +340,7 @@ class TabletCopier {
                                         FLAGS_tablet_copy_throttler_bytes_per_sec,
                                         FLAGS_tablet_copy_throttler_burst_factor);
 }
-
+std::atomic<bool> has_failed_tablets(false);
 // Start to copy tablets.
 for (const auto& tablet_id : tablet_ids_to_copy_) {
   RETURN_NOT_OK(copy_pool->Submit([&]() {
@@ -381,6 +382,10 @@ class TabletCopier {
   }
   copying_replicas_by_tablet_id.erase(tablet_id);
 
+  if (!failed_tablet_ids.empty()) {
+    has_failed_tablets.store(true);
+  }
+
   LOG(INFO) << Substitute("$0/$1 tablets, $2 bytes copied, include $3 failed tablets.",
                           succeed_tablet_count + failed_tablet_ids.size(),
                           total_tablet_count,
@@ -396,6 +401,9 @@ class TabletCopier {
 latch.CountDown();
 check_thread->Join();
 
+if (has_failed_tablets.load()) {
+  return Status::RuntimeError("some tablets failed to copy: check error messages for details");
+}
 return Status::OK();
   }
 

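The pattern the patch relies on (worker tasks record a failure in a shared atomic flag, and the caller turns that flag into a non-OK result once all tasks have finished) can be sketched in isolation. The following is only an illustrative sketch under stated assumptions, not the Kudu implementation: CopyOneTablet and CopyAllTablets are hypothetical names, plain std::thread stands in for Kudu's thread pool, and a bool return value stands in for Status.

```cpp
#include <atomic>
#include <string>
#include <thread>
#include <vector>

// Hypothetical stand-in for copying one tablet; this is NOT a Kudu API.
// For illustration, any id that starts with "missing" counts as a failed copy.
bool CopyOneTablet(const std::string& tablet_id) {
  return tablet_id.rfind("missing", 0) != 0;
}

// Copies every tablet on its own thread. Returns true only if all copies
// succeeded, mirroring the idea of returning a non-OK status on any failure.
bool CopyAllTablets(const std::vector<std::string>& tablet_ids) {
  // Shared between worker threads, hence atomic.
  std::atomic<bool> has_failed_tablets(false);
  std::vector<std::thread> workers;
  workers.reserve(tablet_ids.size());
  for (const auto& tablet_id : tablet_ids) {
    // The id is captured by value so each worker owns its own copy.
    workers.emplace_back([&has_failed_tablets, tablet_id]() {
      if (!CopyOneTablet(tablet_id)) {
        has_failed_tablets.store(true);
      }
    });
  }
  // Wait for all workers before reading the flag.
  for (auto& worker : workers) {
    worker.join();
  }
  return !has_failed_tablets.load();
}
```

A caller would map a false result from CopyAllTablets to a non-zero process exit code, so that scripts can detect the failure without parsing the log.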


(kudu) branch master updated: [java] methods for setting run-time flags via test harness

2024-02-29 Thread alexey
This is an automated email from the ASF dual-hosted git repository.

alexey pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/kudu.git


The following commit(s) were added to refs/heads/master by this push:
 new 0e390e19b [java] methods for setting run-time flags via test harness
0e390e19b is described below

commit 0e390e19b93a40302f53791afbcc95d110c28b57
Author: Alexey Serbin 
AuthorDate: Wed Feb 28 18:47:29 2024 -0800

[java] methods for setting run-time flags via test harness

When I was trying to reproduce an issue seen in the field, I found
that necessary functionality was missing in the Java test harness.
This patch addresses the deficiency, introducing the corresponding
methods to set run-time flags for Kudu masters and tablet servers.
This patch also updates the test harness class to provide a means
to restart a tablet server that hosts a particular tablet replica.

Change-Id: I5ed12b2ef9fd077534528361f6bb42efe3730182
Reviewed-on: http://gerrit.cloudera.org:8080/21093
Reviewed-by: Abhishek Chennaka 
Tested-by: Alexey Serbin 
---
 .../java/org/apache/kudu/test/KuduTestHarness.java | 35 ++
 1 file changed, 35 insertions(+)

diff --git a/java/kudu-test-utils/src/main/java/org/apache/kudu/test/KuduTestHarness.java b/java/kudu-test-utils/src/main/java/org/apache/kudu/test/KuduTestHarness.java
index 47b6656f0..b8d01621e 100644
--- a/java/kudu-test-utils/src/main/java/org/apache/kudu/test/KuduTestHarness.java
+++ b/java/kudu-test-utils/src/main/java/org/apache/kudu/test/KuduTestHarness.java
@@ -336,6 +336,30 @@ public class KuduTestHarness extends ExternalResource {
 return hp;
   }
 
+  /**
+   * Set a run-time flag for a tablet server identified by its host and port.
+   * @param hp HostAndPort object identifying the target tablet server
+   * @param flag a flag to set (prefix dash(es) omitted)
+   * @param value a stringified representation of the flag's value to set
+   * @throws IOException
+   */
+  public void setTabletServerFlag(HostAndPort hp, String flag, String value) throws IOException {
+miniCluster.setTServerFlag(hp, flag, value);
+  }
+
+  /**
+   * Kills and starts back a tablet server that serves the given tablet's leader.
+   * @param tablet a LocatedTablet which is hosted by the target tablet server
+   * @return the host and port of the restarted tablet server
+   * @throws Exception
+   */
+  public HostAndPort restartTabletLeader(LocatedTablet tablet) throws Exception {
+HostAndPort hp = findLeaderTabletServer(tablet);
+miniCluster.killTabletServer(hp);
+miniCluster.startTabletServer(hp);
+return hp;
+  }
+
   /**
* Kills and restarts the leader master.
* @return the host and port of the restarted master
@@ -386,6 +410,17 @@ public class KuduTestHarness extends ExternalResource {
 miniCluster.resumeMasterServer(hp);
   }
 
+  /**
+   * Set a run-time flag for a Kudu master identified by its host and port.
+   * @param hp HostAndPort object identifying the target master
+   * @param flag a flag to set (prefix dash(es) omitted)
+   * @param value a stringified representation of the flag's value to set
+   * @throws IOException
+   */
+  public void setMasterFlag(HostAndPort hp, String flag, String value) throws IOException {
+miniCluster.setMasterFlag(hp, flag, value);
+  }
+
   /**
* Return the comma-separated list of "host:port" pairs that describes the master
* config for this cluster.