[kudu] 01/02: KUDU-3373 Key provider interface

2022-06-02 Thread alexey
This is an automated email from the ASF dual-hosted git repository.

alexey pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/kudu.git

commit abe2a73cdbe6438e34f825594d66aec53a329840
Author: Attila Bukor 
AuthorDate: Thu May 26 17:17:37 2022 +0200

KUDU-3373 Key provider interface

Kudu's server keys need to be encrypted on the servers, otherwise its
broken, as an attacker who can access Kudu's disks, can easily steal the
server keys used to encrypt the file keys. The cluster key, which will
be used to encrypt/decrypt the server keys, will live outside the
cluster. This commit introduces a key provider interface to
encrypt/decrypt server keys, with a reference (test-only) implementation
which uses memfrob() (a GNU C function that XORs an array with 42). A
follow-up commit will introduce a production-ready implementation that
uses Apache Ranger KMS to provide the keys.

Change-Id: Ie6ccc05fb991f0fd5cbcd8a49f5b23286d1094ac
Reviewed-on: http://gerrit.cloudera.org:8080/18568
Reviewed-by: Alexey Serbin 
Tested-by: Attila Bukor 
Reviewed-by: Zoltan Chovan 
---
 src/kudu/fs/fs_manager.cc  | 26 ++---
 src/kudu/fs/fs_manager.h   |  8 
 src/kudu/mini-cluster/external_mini_cluster.cc | 11 --
 src/kudu/mini-cluster/external_mini_cluster.h  |  6 +++
 src/kudu/server/CMakeLists.txt |  3 ++
 src/kudu/server/default_key_provider-test.cc   | 48 +++
 src/kudu/server/default_key_provider.h | 53 ++
 src/kudu/server/key_provider.h | 41 
 src/kudu/tools/tool_action_common.cc   |  4 +-
 src/kudu/util/env.h|  3 +-
 src/kudu/util/env_posix.cc |  2 +-
 src/kudu/util/test_util.cc |  2 +-
 12 files changed, 193 insertions(+), 14 deletions(-)

diff --git a/src/kudu/fs/fs_manager.cc b/src/kudu/fs/fs_manager.cc
index 56315592d..425fb428a 100644
--- a/src/kudu/fs/fs_manager.cc
+++ b/src/kudu/fs/fs_manager.cc
@@ -49,6 +49,8 @@
 #include "kudu/gutil/strings/substitute.h"
 #include "kudu/gutil/strings/util.h"
 #include "kudu/gutil/walltime.h"
+#include "kudu/server/default_key_provider.h"
+#include "kudu/server/key_provider.h"
 #include "kudu/util/env_util.h"
 #include "kudu/util/flag_tags.h"
 #include "kudu/util/metrics.h"
@@ -144,6 +146,7 @@ using kudu::fs::ReadableBlock;
 using kudu::fs::UpdateInstanceBehavior;
 using kudu::fs::WritableBlock;
 using kudu::pb_util::SecureDebugString;
+using kudu::security::DefaultKeyProvider;
 using std::ostream;
 using std::string;
 using std::unique_ptr;
@@ -193,6 +196,9 @@ FsManager::FsManager(Env* env, FsManagerOpts opts)
 meta_on_xfs_(false) {
   DCHECK(opts_.update_instances == UpdateInstanceBehavior::DONT_UPDATE ||
  !opts_.read_only) << "FsManager can only be for updated if not in 
read-only mode";
+  if (FLAGS_encrypt_data_at_rest) {
+key_provider_.reset(new DefaultKeyProvider());
+  }
 }
 
 FsManager::~FsManager() {}
@@ -457,10 +463,14 @@ Status FsManager::Open(FsReport* report, Timer* 
read_instance_metadata_files,
 read_instance_metadata_files->Stop();
   }
 
-  if (!server_key().empty()) {
-env_->SetEncryptionKey(server_key().length() * 4,
-   reinterpret_cast(
- strings::a2b_hex(server_key()).c_str()));
+  if (!server_key().empty() && key_provider_) {
+string server_key;
+RETURN_NOT_OK(key_provider_->DecryptServerKey(this->server_key(), 
&server_key));
+// 'server_key' is a hexadecimal string and SetEncryptionKey expects bits
+// (hex / 2 = bytes * 8 = bits).
+env_->SetEncryptionKey(reinterpret_cast(
+ strings::a2b_hex(server_key).c_str()),
+   server_key.length() * 4);
   }
 
   // Open the directory manager if it has not been opened already.
@@ -672,14 +682,18 @@ Status 
FsManager::CreateInstanceMetadata(boost::optional uuid,
 metadata->set_uuid(oid_generator_.Next());
   }
   if (server_key) {
-metadata->set_server_key(server_key.get());
+RETURN_NOT_OK(key_provider_->EncryptServerKey(server_key.get(),
+  
metadata->mutable_server_key()));
   } else if (FLAGS_encrypt_data_at_rest) {
 uint8_t key_bytes[32];
 int num_bytes = FLAGS_encryption_key_length / 8;
 DCHECK(num_bytes <= sizeof(key_bytes));
 OPENSSL_RET_NOT_OK(RAND_bytes(key_bytes, num_bytes),
"Failed to generate random key");
-strings::b2a_hex(key_bytes, metadata->mutable_server_key(), num_bytes);
+string plain_server_key;
+strings::b2a_hex(key_bytes, &plain_server_key, num_bytes);
+RETURN_NOT_OK(key_provider_->EncryptServerKey(plain_server_key,
+  
metadata->mutable_server_key()));
   }
 
   str

[kudu] 02/02: [common] add DCHECK to spot bugs in using Schema

2022-06-02 Thread alexey
This is an automated email from the ASF dual-hosted git repository.

alexey pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/kudu.git

commit 9699f5885ca18a1760d5d150b3ba82424d8948b3
Author: Alexey Serbin 
AuthorDate: Mon Feb 14 13:56:40 2022 -0800

[common] add DCHECK to spot bugs in using Schema

While reviewing recently posted patch [1], I was looking at the code in
src/kudu/common/schema.h and decided to add more DCHECK-like asserts
to help catching bugs in the implementation of the Schema class itself
and spotting errors at the call sites where Schema objects are used.

This patch doesn't contain any functional changes.

[1] https://gerrit.cloudera.org/#/c/18213/

Change-Id: I44b73ef06924af6556a7cd3da4a7eec20d12aefc
Reviewed-on: http://gerrit.cloudera.org:8080/18231
Reviewed-by: Andrew Wong 
Reviewed-by: Mahesh Reddy 
Tested-by: Alexey Serbin 
---
 src/kudu/common/schema.h | 24 +++-
 1 file changed, 15 insertions(+), 9 deletions(-)

diff --git a/src/kudu/common/schema.h b/src/kudu/common/schema.h
index 6662319ae..be93c9aab 100644
--- a/src/kudu/common/schema.h
+++ b/src/kudu/common/schema.h
@@ -534,6 +534,7 @@ class Schema {
   // Return the number of bytes needed to represent
   // only the key portion of this schema.
   size_t key_byte_size() const {
+DCHECK(initialized());
 return col_offsets_[num_key_columns_];
   }
 
@@ -545,6 +546,7 @@ class Schema {
 // division-by-a-constant gets optimized into multiplication,
 // the multiplication instruction has a significantly higher latency
 // than the simple load.
+DCHECK_EQ(cols_.size(), name_to_index_.size());
 return name_to_index_.size();
   }
 
@@ -597,12 +599,11 @@ class Schema {
   // Return the column index corresponding to the given column,
   // or kColumnNotFound if the column is not in this schema.
   int find_column(const StringPiece col_name) const {
-auto iter = name_to_index_.find(col_name);
+const auto iter = name_to_index_.find(col_name);
 if (PREDICT_FALSE(iter == name_to_index_.end())) {
   return kColumnNotFound;
-} else {
-  return (*iter).second;
 }
+return iter->second;
   }
 
   // Returns true if the schema contains nullable columns
@@ -622,8 +623,9 @@ class Schema {
 
   // Returns the list of primary key column IDs.
   std::vector get_key_column_ids() const {
-return std::vector(
-col_ids_.begin(), col_ids_.begin() + num_key_columns_);
+DCHECK_LE(num_key_columns_, col_ids_.size());
+return std::vector(col_ids_.begin(),
+ col_ids_.begin() + num_key_columns_);
   }
 
   // Return true if this Schema is initialized and valid.
@@ -717,15 +719,19 @@ class Schema {
 
   // Return the projection of this schema which contains only
   // the key columns.
-  // TODO: this should take a Schema* out-parameter to avoid an
+  //
+  // TODO(todd): this should take a Schema* out-parameter to avoid an
   // extra copy of the ColumnSchemas.
-  // TODO this should probably be cached since the key projection
+  //
+  // TODO(dralves): this should probably be cached since the key projection
   // is not supposed to change, for a single schema.
   Schema CreateKeyProjection() const {
+DCHECK_LE(num_key_columns_, cols_.size());
 std::vector key_cols(cols_.begin(),
-  cols_.begin() + num_key_columns_);
+   cols_.begin() + num_key_columns_);
 std::vector col_ids;
 if (!col_ids_.empty()) {
+  DCHECK_LE(num_key_columns_, col_ids_.size());
   col_ids.assign(col_ids_.begin(), col_ids_.begin() + num_key_columns_);
 }
 
@@ -869,7 +875,7 @@ class Schema {
 const bool use_column_ids = base_schema.has_column_ids() && 
has_column_ids();
 
 int proj_idx = 0;
-for (int i = 0; i < num_columns(); ++i) {
+for (size_t i = 0; i < num_columns(); ++i) {
   const ColumnSchema& col_schema = cols_[i];
 
   // try to lookup the column by ID if present or just by name.



[kudu] branch master updated (b6aaf2b71 -> 9699f5885)

2022-06-02 Thread alexey
This is an automated email from the ASF dual-hosted git repository.

alexey pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/kudu.git


from b6aaf2b71 KUDU-3368 Encrypt file keys with server keys
 new abe2a73cd KUDU-3373 Key provider interface
 new 9699f5885 [common] add DCHECK to spot bugs in using Schema

The 2 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.


Summary of changes:
 src/kudu/common/schema.h   | 24 ++
 src/kudu/fs/fs_manager.cc  | 26 ---
 src/kudu/fs/fs_manager.h   |  8 
 src/kudu/mini-cluster/external_mini_cluster.cc | 11 +++--
 src/kudu/mini-cluster/external_mini_cluster.h  |  6 +++
 src/kudu/server/CMakeLists.txt |  3 ++
 .../default_key_provider-test.cc}  | 30 ++--
 src/kudu/server/default_key_provider.h | 53 ++
 .../test/test_pass.h => server/key_provider.h} | 19 
 src/kudu/tools/tool_action_common.cc   |  4 +-
 src/kudu/util/env.h|  3 +-
 src/kudu/util/env_posix.cc |  2 +-
 src/kudu/util/test_util.cc |  2 +-
 13 files changed, 147 insertions(+), 44 deletions(-)
 copy src/kudu/{util/user-test.cc => server/default_key_provider-test.cc} (65%)
 create mode 100644 src/kudu/server/default_key_provider.h
 copy src/kudu/{security/test/test_pass.h => server/key_provider.h} (66%)