Hi, We already have EAR for celeborn shuffle data internally at LinkedIn, where we have added this support to respect the existing spark.io.encryption.enabled config in Spark on the client side.
I am happy to contribute this back and start a CIP for this next week. Thanks, Aravind On Sat, Apr 18, 2026 at 10:24 PM Karthik Prabhakar <[email protected]> wrote: > Hi dev@, > > I’d like to propose adding at-rest encryption for shuffle data in Celeborn > and would appreciate the community’s input before writing a full > implementation. > cURRENT gap > > Celeborn encrypts data in transit (TLS, SASL) but not at rest. When a > worker flushes shuffle data to local disk, HDFS, S3, or OSS, the bytes land > as plaintext. > > The only write site for local disk is LocalFlushTask.flush() in > FlushTask.scala (L66, L71 at commit a56f69a), which calls > fileChannel.write(buffer) with no cipher transform. The tiered-storage > paths (HdfsFlushTask, S3FlushTask, OssFlushTask) are the same — raw bytes > to the underlying store. > > Verified with: > > grep -rnE 'cipher|\.encrypt|aes|envelope' worker/src/main/ > grep -rn 'javax\.crypto' worker/src/main/ > (both zero matches) > > This matters because spark.io.encryption.enabled does *not* cover the > Celeborn path. When Celeborn’s ShuffleManager replaces Spark’s shuffle > writer, Spark’s encryption key is never consulted — confirmed by grepping > client-spark/ for IOEncryptionKey (zero matches). > > Teams adopting Celeborn for performance silently lose shuffle-encryption > guarantees their compliance posture may assume. > Who Needs This > > - Regulated industries (healthcare, finance, public sector) whose > auditors require application-layer encryption independent of disk/volume > encryption. > - Multi-tenant platforms needing cryptographic isolation between tenants > on shared workers. > - Teams using object-store tiering who want encryption before offload. > > Proposed Approach (High Level) > > 1. A *StreamCipher SPI* in common/ for wrapping WritableByteChannel / > ReadableByteChannel with encrypt/decrypt. No KMS SDK in core. > 2. A *KeyService SPI* for envelope encryption — generate/unwrap DEKs > using a KMS-held KEK. Implementations live in separate optional modules > ( > aws-kms, gcp-kms, azure-kv, vault, static for dev/PoC). > 3. Wire into the worker write path: LocalFlushTask wraps fileChannel > with StreamCipher.wrapForWrite(). Same for HDFS/S3/OSS flush tasks. > 4. Wire into the reader path: LocalPartitionDataReader detects a 16-byte > encrypted-file header, unwraps the DEK (cached per worker+shuffle), > wraps > the channel with StreamCipher.wrapForRead(). > 5. Opt-in via celeborn.shuffle.io.encryption.enabled=true. Default off. > Unencrypted deployments are byte-identical to today, zero overhead. > 6. Per-shuffle DEKs by default (one KMS call per shuffle reservation, > amortized). Per-application DEK scope as an option. > > Interaction with Recent Work > > CELEBORN-2301 (commit 95419e1) recently landed enhanced zero-copy sendfile > for FileRegion on native transports — a nice throughput win for the fetch > path. > > Encryption and sendfile are fundamentally incompatible: sendfile(2) cannot > transform bytes, so encrypted partitions must use a buffered read path. > This is only relevant for encrypted workloads; unencrypted workloads on the > same cluster keep the full CELEBORN-2301 benefit. Per-application > encryption flags (not per-cluster) would let encrypted and unencrypted apps > coexist without regressing the latter. > Questions for the Community > > Trimming to three since these are the ones I’d need opinions on before > writing code. Happy to take the rest up in follow-ups. > > - Any prior design work or internal discussion on this topic I should > know about before proceeding? > - *Per-shuffle vs. per-application DEK scope* as the default? > Per-shuffle gives smaller blast radius and simpler lifecycle; > per-application amortizes KMS round-trips and is friendlier for > long-running jobs. > - *Key distribution path:* wrapped DEKs flow through Master metadata > (simpler, one KMS-aware role) vs. workers unwrap directly from KMS > (removes > Master from the key path, but every worker needs KMS credentials). > Preference? > > Tracking > > JIRA: CELEBORN-2311 <https://issues.apache.org/jira/browse/CELEBORN-2311> > > I have a detailed design document with source citations, threat model, > performance analysis, and phased implementation plan. Happy to share > on-list or off-list if there’s interest. > > - Karthik > -- Aravind K. Patnam
