pitrou commented on code in PR #45411:
URL: https://github.com/apache/arrow/pull/45411#discussion_r1946183485


##########
docs/source/cpp/parquet.rst:
##########
@@ -585,6 +585,82 @@ More specifically, Parquet C++ supports:
 * EncryptionWithFooterKey and EncryptionWithColumnKey modes.
 * Encrypted Footer and Plaintext Footer modes.
 
+Configuration
+~~~~~~~~~~~~~
+
+An example of writing a dataset in the encrypted Parquet file format:
+
+.. code-block:: cpp
+
+   #include <arrow/util/logging.h>
+
+   #include "arrow/dataset/file_parquet.h"
+   #include "arrow/dataset/parquet_encryption_config.h"
+   #include "arrow/testing/gtest_util.h"
+   #include "parquet/encryption/crypto_factory.h"
+
+   using arrow::internal::checked_pointer_cast;
+
+   auto crypto_factory = std::make_shared<parquet::encryption::CryptoFactory>();
+   parquet::encryption::KmsClientFactory kms_client_factory = ...;
+   crypto_factory->RegisterKmsClientFactory(std::move(kms_client_factory));
+   auto kms_connection_config = std::make_shared<parquet::encryption::KmsConnectionConfig>();
+
+   // Set write options with encryption configuration.
+   auto encryption_config =
+       std::make_shared<parquet::encryption::EncryptionConfiguration>(
+           std::string("footer_key"));
+   encryption_config->column_keys = "col_key: a";
+   auto parquet_encryption_config = std::make_shared<ParquetEncryptionConfig>();
+   // Directly assign shared_ptr objects to ParquetEncryptionConfig members
+   parquet_encryption_config->crypto_factory = crypto_factory;
+   parquet_encryption_config->kms_connection_config = kms_connection_config;
+   parquet_encryption_config->encryption_config = std::move(encryption_config);
+
+   auto file_format = std::make_shared<ParquetFileFormat>();
+   auto parquet_file_write_options =
+       checked_pointer_cast<ParquetFileWriteOptions>(file_format->DefaultWriteOptions());
+   parquet_file_write_options->parquet_encryption_config =
+       std::move(parquet_encryption_config);
+
+   // Write dataset.
+   std::shared_ptr<arrow::Table> table = ...;
+   auto dataset = std::make_shared<InMemoryDataset>(table);
+   EXPECT_OK_AND_ASSIGN(auto scanner_builder, dataset->NewScan());
+   EXPECT_OK_AND_ASSIGN(auto scanner, scanner_builder->Finish());
+
+   FileSystemDatasetWriteOptions write_options;
+   write_options.file_write_options = parquet_file_write_options;
+   write_options.base_dir = "example.parquet";
+   ARROW_CHECK_OK(FileSystemDataset::Write(write_options, std::move(scanner)));
+
+Column encryption is configured by setting ``encryption_config->column_keys`` to a string
+of the format ``"masterKeyID:colName,colName;masterKeyID:colName..."``.

Review Comment:
   Any improvements in the source documentation are welcome, if you have time and motivation for them :)
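
To illustrate the `column_keys` format quoted in the diff above (`"masterKeyID:colName,colName;masterKeyID:colName..."`), here is a minimal sketch of how such a string could be assembled. `BuildColumnKeys` is a hypothetical helper invented for this illustration; it is not part of Arrow or Parquet C++, and it assumes key IDs and column names contain none of the `:`, `,` or `;` separators.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical helper (not an Arrow/Parquet API): joins a mapping of
// master key ID -> column names into "keyA:col1,col2;keyB:col3".
// std::map iterates in sorted key order, so key IDs appear alphabetically.
std::string BuildColumnKeys(
    const std::map<std::string, std::vector<std::string>>& columns_by_key) {
  std::string result;
  for (const auto& entry : columns_by_key) {
    const std::vector<std::string>& columns = entry.second;
    if (columns.empty()) continue;  // a key with no columns contributes nothing
    if (!result.empty()) result += ';';  // separator between key groups
    result += entry.first;               // master key ID
    result += ':';
    for (std::size_t i = 0; i < columns.size(); ++i) {
      if (i > 0) result += ',';  // separator between column names
      result += columns[i];
    }
  }
  return result;
}
```

Under these assumptions, the result could be assigned to `encryption_config->column_keys` in place of a hand-written string literal.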


