ivandika3 commented on code in PR #407:
URL: https://github.com/apache/ozone-site/pull/407#discussion_r3214197051


##########
docs/07-system-internals/02-data-operations/02-read.md:
##########
@@ -3,27 +3,118 @@ draft: true
 sidebar_label: Read
 ---
 
-# Implementation of Read Operations
-
-**TODO:** File a subtask under 
[HDDS-9862](https://issues.apache.org/jira/browse/HDDS-9862) and complete this 
page or section.
-
-## Reading Metadata
-
-## Reading Data
-
-Trace every part of a read request from beginning to end. This includes:
-
-- Client getting encryption keys
-- Client calling OM to create key
-- OM validating client's Kerberos principal
-- OM checking permissions (Ranger or Native ACLs)
-- OM generating block tokens from the shared secret previously retrieved from 
SCM
-- OM getting block locations from SCM or from its cache.
-- OM returning container, blocks, pipeline, block tokens
-- Client sending block tokens and Datanode validating based on the shared 
secret from SCM
-- Client sending read chunk requests to Datanode to fetch the data.
-  - For replication:
-    - Include topology choices of which Datanodes to use
-    - Include failover handling
-  - For EC, link to the [EC feature page](../features/erasure-coding).
-- Client validating checksums
+# Apache Ozone Internals: Read Operation Implementation Guide
+
+This guide provides a comprehensive trace of a read request in Apache Ozone, 
including metadata resolution, security (Block Tokens), Transparent Data 
Encryption (TDE), and Authorization (Ranger/Native ACLs).
+
+---
+
+## 1. Phase 1: Request & Authorization (Client & OM)
+
+### 1.1 Initiating the Request
+
+The application calls `OzoneBucket.readKey(key)`. The client sends a 
`lookupKey` RPC to the Ozone Manager (OM).
+
+### 1.2 OM: Authorization Check
+
+Before processing the lookup, the OM must authorize the user.
+
+1. **Entry Point:** `OmMetadataReader.checkAcls()` is called within the 
`lookupKey` flow.
+2. **Authorizer Selection:** Based on configuration 
(`ozone.acl.authorizer.class`), OM uses either:
+   - **Native Authorizer:** Uses Ozone's internal ACLs stored in RocksDB.
+   - **Apache Ranger Authorizer:** Delegates the decision to the Ranger Ozone 
Plugin (`RangerOzoneAuthorizer`).
+3. **Authorization Logic:**
+   - OM builds an `OzoneObj` (Volume/Bucket/Key) and a `RequestContext` (User, 
IP, Action: READ).
+   - **Ranger Flow:** The plugin checks its local cache of policies 
(periodically synced from the Ranger Admin server). If a policy allows READ for 
the user/group on that resource, access is granted.
+   - **Fallback:** If Ranger is disabled or the Native authorizer is used, OM 
checks the object's ACL list for matching user/group permissions.
+
+### 1.3 OM: Key & Encryption Resolution
+
+Once authorized:
+
+1. **Key Lookup:** OM finds the `OmKeyInfo` in the `keyTable`.
+2. **Encryption Check:** If TDE is enabled, the `OmKeyInfo` contains the 
`EDEK` (Encrypted Data Encryption Key) and the EZ Key Name.
+3. **Block Allocation:** OM retrieves the `OmKeyLocationInfo` (Block IDs and 
Pipelines).

Review Comment:
   "Block Allocation" seems awkward for a read path; use "Block Retrieval" instead.
   
   It may also be worth mentioning the container location cache, as well as 
the locality-aware behavior (OM sorts the datanodes using the network topology 
it caches from SCM).
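
   To illustrate the locality-aware sorting, here is a minimal sketch. It is 
written in Python rather than Ozone's Java, the function names 
`network_distance` and `sort_by_locality` are hypothetical, and the path-based 
distance is a simplifying assumption rather than OM's actual implementation:

```python
from typing import List


def network_distance(a: str, b: str) -> int:
    """Hops between two topology paths such as '/dc1/rack1/dn1'.

    Assumption for this sketch: distance is the number of path levels
    each location must climb to reach the closest common ancestor.
    """
    pa = [p for p in a.split("/") if p]
    pb = [p for p in b.split("/") if p]
    common = 0
    for x, y in zip(pa, pb):
        if x != y:
            break
        common += 1
    return (len(pa) - common) + (len(pb) - common)


def sort_by_locality(client: str, datanodes: List[str]) -> List[str]:
    """Return the datanodes ordered from nearest to farthest from the client."""
    return sorted(datanodes, key=lambda dn: network_distance(client, dn))


replicas = ["/dc2/rack9/dn3", "/dc1/rack2/dn2", "/dc1/rack1/dn1"]
ordered = sort_by_locality("/dc1/rack1/client", replicas)
```

   With an ordering like this, the client would try the same-rack replica 
first and fall back to the more distant ones.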



##########
docs/07-system-internals/02-data-operations/02-read.md:
##########
@@ -3,27 +3,118 @@ draft: true
 sidebar_label: Read
 ---
 
-# Implementation of Read Operations
-
-**TODO:** File a subtask under 
[HDDS-9862](https://issues.apache.org/jira/browse/HDDS-9862) and complete this 
page or section.
-
-## Reading Metadata
-
-## Reading Data
-
-Trace every part of a read request from beginning to end. This includes:
-
-- Client getting encryption keys
-- Client calling OM to create key
-- OM validating client's Kerberos principal
-- OM checking permissions (Ranger or Native ACLs)
-- OM generating block tokens from the shared secret previously retrieved from 
SCM
-- OM getting block locations from SCM or from its cache.
-- OM returning container, blocks, pipeline, block tokens
-- Client sending block tokens and Datanode validating based on the shared 
secret from SCM
-- Client sending read chunk requests to Datanode to fetch the data.
-  - For replication:
-    - Include topology choices of which Datanodes to use
-    - Include failover handling
-  - For EC, link to the [EC feature page](../features/erasure-coding).
-- Client validating checksums
+# Apache Ozone Internals: Read Operation Implementation Guide
+
+This guide provides a comprehensive trace of a read request in Apache Ozone, 
including metadata resolution, security (Block Tokens), Transparent Data 
Encryption (TDE), and Authorization (Ranger/Native ACLs).
+
+---
+
+## 1. Phase 1: Request & Authorization (Client & OM)
+
+### 1.1 Initiating the Request
+
+The application calls `OzoneBucket.readKey(key)`. The client sends a 
`lookupKey` RPC to the Ozone Manager (OM).

Review Comment:
   Use `getKeyInfo` instead; `lookupKey` is already deprecated.



##########
docs/07-system-internals/02-data-operations/02-read.md:
##########
@@ -3,27 +3,118 @@ draft: true
 sidebar_label: Read
 ---
 
-# Implementation of Read Operations
-
-**TODO:** File a subtask under 
[HDDS-9862](https://issues.apache.org/jira/browse/HDDS-9862) and complete this 
page or section.
-
-## Reading Metadata
-
-## Reading Data
-
-Trace every part of a read request from beginning to end. This includes:
-
-- Client getting encryption keys
-- Client calling OM to create key
-- OM validating client's Kerberos principal
-- OM checking permissions (Ranger or Native ACLs)
-- OM generating block tokens from the shared secret previously retrieved from 
SCM
-- OM getting block locations from SCM or from its cache.
-- OM returning container, blocks, pipeline, block tokens
-- Client sending block tokens and Datanode validating based on the shared 
secret from SCM
-- Client sending read chunk requests to Datanode to fetch the data.
-  - For replication:
-    - Include topology choices of which Datanodes to use
-    - Include failover handling
-  - For EC, link to the [EC feature page](../features/erasure-coding).
-- Client validating checksums
+# Apache Ozone Internals: Read Operation Implementation Guide
+
+This guide provides a comprehensive trace of a read request in Apache Ozone, 
including metadata resolution, security (Block Tokens), Transparent Data 
Encryption (TDE), and Authorization (Ranger/Native ACLs).
+
+---
+
+## 1. Phase 1: Request & Authorization (Client & OM)
+
+### 1.1 Initiating the Request
+
+The application calls `OzoneBucket.readKey(key)`. The client sends a 
`lookupKey` RPC to the Ozone Manager (OM).
+
+### 1.2 OM: Authorization Check
+
+Before processing the lookup, the OM must authorize the user.
+
+1. **Entry Point:** `OmMetadataReader.checkAcls()` is called within the 
`lookupKey` flow.
+2. **Authorizer Selection:** Based on configuration 
(`ozone.acl.authorizer.class`), OM uses either:
+   - **Native Authorizer:** Uses Ozone's internal ACLs stored in RocksDB.
+   - **Apache Ranger Authorizer:** Delegates the decision to the Ranger Ozone 
Plugin (`RangerOzoneAuthorizer`).
+3. **Authorization Logic:**
+   - OM builds an `OzoneObj` (Volume/Bucket/Key) and a `RequestContext` (User, 
IP, Action: READ).
+   - **Ranger Flow:** The plugin checks its local cache of policies 
(periodically synced from the Ranger Admin server). If a policy allows READ for 
the user/group on that resource, access is granted.
+   - **Fallback:** If Ranger is disabled or the Native authorizer is used, OM 
checks the object's ACL list for matching user/group permissions.
+
+### 1.3 OM: Key & Encryption Resolution
+
+Once authorized:
+
+1. **Key Lookup:** OM finds the `OmKeyInfo` in the `keyTable`.
+2. **Encryption Check:** If TDE is enabled, the `OmKeyInfo` contains the 
`EDEK` (Encrypted Data Encryption Key) and the EZ Key Name.
+3. **Block Allocation:** OM retrieves the `OmKeyLocationInfo` (Block IDs and 
Pipelines).
+4. **Block Token Generation:** OM generates a signed Block Token for each 
block using secret keys managed by the SCM.
+
+OM returns `OmKeyInfo` (Metadata + `EDEK` + Block Tokens) to the client.
+
+---
+
+## 2. Phase 2: Decryption Setup (Client & KMS)
+
+### 2.1 Decrypting the `EDEK`
+
+If the key is encrypted:
+
+1. **KMS Request:** The client sends the `EDEK` to the KMS (Key Management 
Server).
+2. **KMS Authorization:** The KMS also performs an authorization check (often 
via Ranger KMS plugin) to ensure the user can use the EZ Key for decryption.
+3. **`DEK` Retrieval:** KMS returns the raw `DEK` (Data Encryption Key) to the 
client.
+
+### 2.2 Initializing the Crypto Stream
+
+The client wraps the data stream in a `CryptoInputStream` initialized with the 
raw `DEK` and the IV from the metadata.
+
+---
+
+## 3. Phase 3: Data Retrieval (Client & Datanode)
+
+### 3.1 Fetching Encrypted Chunks
+
+The client's `ChunkInputStream` sends a `ReadChunk` request to a Datanode.
+
+- **Security:** The request includes the Block Token.
+- **Datanode Validation:** The Datanode verifies the token's signature using 
the Secret Keys it fetched from the SCM. This is the final "at-the-edge" 
authorization check.
+- **Data Transfer:** The Datanode reads the encrypted data from disk and 
streams it back.

Review Comment:
   This section should mention `ReadBlock` as well.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


