This is an automated email from the ASF dual-hosted git repository.
broustant pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/solr-sandbox.git
The following commit(s) were added to refs/heads/main by this push:
new 3d6d404 Update ENCRYPTION.md for the distrib=true param. (#121)
3d6d404 is described below
commit 3d6d404fe4d03bb92bbf0b286b32940109d84502
Author: Bruno Roustant <[email protected]>
AuthorDate: Mon Aug 4 18:29:09 2025 +0200
Update ENCRYPTION.md for the distrib=true param. (#121)
---
ENCRYPTION.md | 44 +++++++++++++++++++++++++++-----------------
1 file changed, 27 insertions(+), 17 deletions(-)
diff --git a/ENCRYPTION.md b/ENCRYPTION.md
index 4bb948a..ae45f75 100644
--- a/ENCRYPTION.md
+++ b/ENCRYPTION.md
@@ -10,20 +10,26 @@ obviously the key secret is never stored). It is possible to define a different
provides an EncryptionRequestHandler so that a client can trigger the (re)encryption of a Solr Core index. The
(re)encryption is done concurrently while the Solr Core can continue to serve update and query requests.
+A custom key "cookie" can be stored in the commit metadata if it is required to get the key secret. For example, it can
+be the key secret in a wrapped (encrypted) form that only a Key Management System can decrypt.
+
In addition, the Solr update logs are also encrypted when the Solr Core index is encrypted. When the active encryption
key changes for the Solr Core, the re-encryption of the update logs is done synchronously when an old log file is
opened for addition. This re-encryption is nearly as fast as a file copy.
+This module also ensures that replication and backup can copy and restore the index files in their encrypted form.
+
Comparing with an OS-level encryption:
- OS-level encryption [1][2] is more performant and more adapted to let Lucene leverage the OS memory cache. It can
manage encryption at block or filesystem level in the OS. This makes it possible to encrypt with different keys
-per-directory, making multi-tenant use-cases possible. If you can use OS-level encryption, prefer it and skip this
-Java-level encryption.
+per-directory, making multi-tenant use-cases possible. If you control and can use an OS-level encryption, prefer it
+compared to this Java-level encryption.
- Java-level encryption can be used when the OS-level encryption management is not possible (e.g. host machine managed
by a cloud provider), or when even admin rights should not allow to get clear access to the index files. It has an
-impact on performance: expect -20% on most queries, -60% on multi-term queries.
+impact on performance: expect -20% on most queries, -60% on multi-term queries. Although, the impact could be less
+important if/when we support JDK 24.
[1] https://wiki.archlinux.org/title/Fscrypt
@@ -39,18 +45,8 @@ needs specific parameters to get a key.
## Installing and Configuring the Encryption Plug-In
-1. Configure the sharedLib directory in solr.xml (e.g. sharedLIb=lib) and place the Encryption plug-in jar file into
-the specified folder.
-
-**solr.xml**
-
-```xml
-<solr>
-
- <str name="sharedLib">${solr.sharedLib:}</str>
-
-</solr>
-```
+1. Place the Encryption plug-in jar file in the lib directory.
+See https://solr.apache.org/guide/solr/latest/configuration-guide/libs.html for details.
2. Configure the Encryption classes in solrconfig.xml.
@@ -99,7 +95,7 @@ to use. By default `CipherAesCtrEncrypter$Factory` is used. You can change to `L
more lightweight and efficient implementation (+10% perf), but it calls an internal com.sun.crypto.provider.AESCrypt()
constructor which either logs a JDK warning (Illegal reflective access) with JDK 16 and below, or with JDK 17 and above
requires to open the access to the com.sun.crypto.provider package with the jvm arg
-`--add-opens=java.base/com.sun.crypto.provider=ALL-UNNAMED`. Both support encrypting files up to 17 TB.
+`--add-opens=java.base/com.sun.crypto.provider=ALL-UNNAMED`. Both support encrypting files up to 17 TB per file.
`EncryptionUpdateHandler` replaces the standard `DirectUpdateHandler2` (which it extends) to store persistently the
encryption key id in the commit metadata. It supports all the configuration parameters of `DirectUpdateHandler2`.
@@ -151,7 +147,10 @@ the parameters `tenantId` and `encryptionKeyBlob` to be sent in the `SolrQueryRe
Once Solr is set up, it is ready to encrypt. To set the encryption key id to use, the Solr client calls the
`EncryptionRequestHandler` at `/admin/encrypt`.
-`EncryptionRequestHandler` handles an encryption request for a specific Solr core.
+By default, `EncryptionRequestHandler` handles an encryption request for a specific Solr core. In Solr Cloud mode, it
+is also possible to add the `distrib=true` parameter to have this handler distribute the encryption request to all the
+leader replicas of all the shards of the collection, ensuring they all encrypt their index shard (it supports the
+`timeAllowed` parameter with a milliseconds timeout).
The caller provides the mandatory `encryptionKeyId` request parameter to define the encryption key id to use to encrypt
the index files. To decrypt the index to cleartext, the special parameter value `no_key_id` must be provided.
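
As an illustration of the request described in this hunk, the following minimal Java sketch builds the URL of an encryption request. The base URL, the collection name, the key id and the helper `EncryptRequestUrl.build` are hypothetical, and placing `/admin/encrypt` under the collection path is an assumption. Only the `encryptionKeyId`, `distrib` and `timeAllowed` parameters and the `no_key_id` value come from the text above.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of the encryption module.
final class EncryptRequestUrl {

  /**
   * Builds the URL of an encryption request.
   * Assumption: the handler is reachable at <solrBaseUrl>/<collection>/admin/encrypt.
   *
   * @param encryptionKeyId the key id to encrypt with, or "no_key_id" to decrypt to cleartext
   * @param distrib         whether to add distrib=true (Solr Cloud mode)
   * @param timeAllowedMs   timeout in milliseconds for the distributed request; ignored if <= 0
   */
  static String build(String solrBaseUrl, String collection, String encryptionKeyId,
                      boolean distrib, long timeAllowedMs) {
    StringBuilder url = new StringBuilder(solrBaseUrl)
        .append('/').append(collection)
        .append("/admin/encrypt?encryptionKeyId=")
        .append(URLEncoder.encode(encryptionKeyId, StandardCharsets.UTF_8));
    if (distrib) {
      url.append("&distrib=true");
      if (timeAllowedMs > 0) {
        url.append("&timeAllowed=").append(timeAllowedMs);
      }
    }
    return url.toString();
  }
}
```

For example, `EncryptRequestUrl.build("http://localhost:8983/solr", "mycollection", "key1", true, 30000L)` produces a distributed request with a 30 second timeout, while passing `"no_key_id"` with `distrib` set to `false` targets a single core for decryption.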
@@ -159,6 +158,7 @@ the index files. To decrypt the index to cleartext, the special parameter value
The encryption processing is asynchronous. The request returns immediately with two response parameters.
- `encryptionState` parameter with value either `pending`, `complete`, or `busy`.
- `status` parameter with values either `success` or `failure`.
+If `distrib=true`, the `encryptionState` is `complete` only if all the shards encryption are complete.
The expected usage of this handler is to first send an encryption request with a key id, and to receive a response with
`status`=`success` and `encryptionState`=`pending`. If the caller needs to know when the encryption is complete, it can
@@ -179,6 +179,16 @@ If your `KeySupplier` implementation requires specific parameters to supply keys
This encryption module implements AES-CTR.
+Rationale about the choice of the CTR-Mode:
+- simple, efficient, random-access.
+- adapted to Lucene immutable index files.
+- file integrity and error detection checks are verified by Lucene checksums.
+- nonce-misuse resistance is implemented by building a secure random IV.
+- used in combination with a strong AES cipher.
+
+The random IV is composed of 5 bytes for the CTR counter, supporting up to 17 TB per file, and 11 bytes for the random
+nonce.
+
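
The 17 TB figure in the added lines above follows from the IV layout if the 5-byte counter is assumed to count 16-byte AES blocks: 2^40 blocks of 16 bytes give 2^44 bytes, about 17.6 TB. The Java sketch below works through that arithmetic and builds an IV with that split. The class name `CtrIvSketch` and the placement of the nonce before the counter are assumptions for illustration, not the module's actual code.

```java
import java.security.SecureRandom;

// Hypothetical sketch of the IV layout described above, not the module's implementation.
final class CtrIvSketch {
  static final int AES_BLOCK_SIZE = 16;                          // bytes per AES block
  static final int COUNTER_BYTES = 5;                            // CTR counter part of the IV
  static final int NONCE_BYTES = AES_BLOCK_SIZE - COUNTER_BYTES; // 11 random bytes

  /** Maximum file size addressable before the 5-byte block counter wraps around. */
  static long maxFileSizeBytes() {
    long maxBlocks = 1L << (8 * COUNTER_BYTES); // 2^40 blocks
    return maxBlocks * AES_BLOCK_SIZE;          // 2^44 bytes
  }

  /**
   * Builds a 16-byte IV: 11 random nonce bytes followed by a zeroed 5-byte counter.
   * Assumption: the counter occupies the low-order (last) bytes of the IV.
   */
  static byte[] newIv(SecureRandom random) {
    byte[] nonce = new byte[NONCE_BYTES];
    random.nextBytes(nonce);
    byte[] iv = new byte[AES_BLOCK_SIZE];
    System.arraycopy(nonce, 0, iv, 0, NONCE_BYTES);
    return iv; // the last COUNTER_BYTES bytes remain zero
  }
}
```

Under these assumptions `CtrIvSketch.maxFileSizeBytes()` returns 17592186044416, which is 16 TiB and matches the "up to 17 TB per file" limit when expressed in decimal terabytes.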
AES-CTR compared to AES-XTS:
Lucene produces read-only files per index segment. Since we have a new random IV per file, we don't repeat the same AES
encrypted blocks. So we are in a safe write-once case where AES-XTS and AES-CTR have the same strength [1][2]. CTR was