pitrou commented on code in PR #45411:
URL: https://github.com/apache/arrow/pull/45411#discussion_r2010357785
##########
docs/source/cpp/parquet.rst:
##########
@@ -585,6 +585,51 @@ More specifically, Parquet C++ supports:
* EncryptionWithFooterKey and EncryptionWithColumnKey modes.
* Encrypted Footer and Plaintext Footer modes.
+Configuration
+~~~~~~~~~~~~~
+
+Parquet encryption uses a ``parquet::encryption::CryptoFactory`` that has access to a
+Key Management System (KMS), which stores actual encryption keys, referenced by key ids.
+The Parquet encryption configuration only uses key ids, no actual keys.
+
+Parquet metadata encryption is configured via ``parquet::encryption::EncryptionConfiguration``:
+
+.. literalinclude:: ../../../cpp/examples/arrow/parquet_column_encryption.cc
+ :language: cpp
+ :start-at: // Set write options with encryption configuration
+ :end-before: encryption_config->column_keys
+ :dedent: 2
+
+All columns are encrypted with the same key as the Parquet metadata when above
+``encryption_config->uniform_encryption`` is set ``true``.
+
+Individual columns are encrypted with individual keys as configured via ``encryption_config->column_keys``.
+This field expects a string of the format ``"columnKeyID:colName,colName;columnKeyID:colName..."``.
+
+.. literalinclude:: ../../../cpp/examples/arrow/parquet_column_encryption.cc
+ :language: cpp
+ :start-at: // Set write options with encryption configuration
+ :end-before: auto parquet_encryption_config
+ :emphasize-lines: 4-5
+ :dedent: 2
+
+See the full `Parquet column encryption example <examples/parquet_column_encryption.html>`_.
+
+.. note::
+
+ Encrypting columns that have nested fields (struct, map, or even list data types)
+ requires column keys for the inner fields, not the column itself.
+ Configuring a column key for the column itself causes this error (here column name is ``col``):
+
+ .. code-block::
+
+ OSError: Encrypted column col not in file schema
+
+ The key and value fields of a map column ``m`` has the names ``m.key_value.key``
+ and ``m.key_value.value``, respectively. The inner field of a list column ``l``
+ has the name ``l.list.element``. An inner field ``f`` of a struct column ``s`` has
+ the name ``s.f``.
Review Comment:
```suggestion
Conventionally, the key and value fields of a map column ``m`` have the names
``m.key_value.key`` and ``m.key_value.value``, respectively. The inner field of a
list column ``l`` has the name ``l.list.element``. An inner field ``f`` of a struct column ``s`` has
the name ``s.f``.
```
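
To make the naming convention concrete, here is a minimal sketch (not part of the suggestion). The helper names (`MapKeyPath`, `MapValuePath`, `ListElementPath`, `StructFieldPath`, `BuildColumnKeys`) are hypothetical and are not part of the Parquet C++ API. The code only assembles a string in the `column_keys` format quoted in the diff, using the standard library alone:

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helpers (not Parquet C++ API): they spell out the inner field
// names described in the note, given the name of the top-level column.
std::string MapKeyPath(const std::string& column) {
  return column + ".key_value.key";
}

std::string MapValuePath(const std::string& column) {
  return column + ".key_value.value";
}

std::string ListElementPath(const std::string& column) {
  return column + ".list.element";
}

std::string StructFieldPath(const std::string& column, const std::string& field) {
  return column + "." + field;
}

// Hypothetical helper: joins (key id, column names) pairs into the
// "columnKeyID:colName,colName;columnKeyID:colName..." format.
// Entries with no column names are skipped.
std::string BuildColumnKeys(
    const std::vector<std::pair<std::string, std::vector<std::string>>>& keys) {
  std::string out;
  for (const auto& entry : keys) {
    const std::vector<std::string>& columns = entry.second;
    if (columns.empty()) continue;
    if (!out.empty()) out += ';';  // separator between key id groups
    out += entry.first;
    out += ':';
    for (std::size_t i = 0; i < columns.size(); ++i) {
      if (i > 0) out += ',';  // separator between column names
      out += columns[i];
    }
  }
  return out;
}
```

Under these assumptions, passing the illustrative key id `kc1` with `MapKeyPath("m")` and `MapValuePath("m")`, and the illustrative key id `kc2` with `ListElementPath("l")`, yields `kc1:m.key_value.key,m.key_value.value;kc2:l.list.element`. That string could then be assigned to `encryption_config->column_keys`.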