Re: [PR] docs(blog): new blog for Hudi NBCC [hudi]

via GitHub Tue, 16 Dec 2025 14:31:03 -0800


Copilot commented on code in PR #17613:
URL: https://github.com/apache/hudi/pull/17613#discussion_r2624961134



##########
website/blog/2025-12-16-maximizing-throughput-nbcc.md:
##########
@@ -0,0 +1,138 @@
+---
+title: "Maximizing Throughput with Apache Hudi NBCC: Stop Retrying, Start Scaling"
+excerpt: "Learn how Hudi's Non-Blocking Concurrency Control eliminates retry storms for concurrent writers, maximizing throughput in streaming and mixed workloads."
+author: "Shiyan Xu"
+category: blog
+image: /assets/images/blog/2025-12-16-maximizing-throughput-nbcc/p6-nbcc-compaction.png
+tags:
+  - hudi
+  - data lakehouse
+  - concurrency control
+  - streaming
+---
+
+Data lakehouses often run multiple concurrent writers—streaming ingestion, 
batch ETL, maintenance jobs. The default approach, Optimistic Concurrency 
Control (OCC), assumes conflicts are rare and handles them through retries. 
That assumption breaks down in increasingly common scenarios, such as running 
maintenance batch jobs on tables receiving streaming writes. When conflicts 
become the norm, retries pile up with OCC, and the write throughput tanks.
+
+Hudi introduced [Non-Blocking Concurrency Control 
(NBCC)](https://hudi.apache.org/docs/concurrency_control#non-blocking-concurrency-control)
 in release 1.0, solving this problem by allowing writers to append data files 
in parallel and using the write completion time to determine the serialization 
order for reads or merges. We'll explore why OCC struggles under real-world 
concurrency, how NBCC works under the hood, and how to configure NBCC in your 
pipelines.
+
+## The Problem with Retries
+
+Picture this scenario: your streaming pipeline ingests clickstream data every 
minute from multiple Kafka topics. A nightly GDPR deletion job kicks off at 
midnight, scanning across thousands of partitions to purge user records—also 
touching data files the ingestion pipeline is actively writing to. By 3 AM, you 
get paged—the deletion job has failed repeatedly, burning compute resources 
while the ingestion writer keeps winning the race to commit.
+
+![p1-occ-retries](/assets/images/blog/2025-12-16-maximizing-throughput-nbcc/p1-occ-retries.png)

Review Comment:
   Missing image file "p1-occ-retries.png". The blog post references this image 
on line 22, but only three PNG files are included in the PR 
(p3-recordkey-filegroup.png, p4-completion-time.png, and 
p6-nbcc-compaction.png). The images p1-occ-retries.png, p2-nbcc-overview.png, 
and p5-truetime.gif are referenced but not included.
   ```suggestion
   *(Diagram: OCC retries stacking up under heavy contention between 
long-running and short-running writers.)*
   ```



##########
website/blog/2025-12-16-maximizing-throughput-nbcc.md:
##########
@@ -0,0 +1,138 @@
+
+OCC assumes conflicts are rare—an assumption that held in traditional 
batch-oriented data lakes where jobs were scheduled sequentially. Most 
transactions will not overlap, so let them proceed optimistically and check for 
conflicts at commit time. But high-frequency streaming breaks this assumption: 
when you have minute-level ingestion plus long-running maintenance jobs, 
overlapping writes are not the exception—they are the norm.
+
+This is a classic concurrency anti-pattern: under OCC, conflict probability 
grows with transaction duration. Long-running jobs competing against frequent 
short writes lose nearly every commit race and retry indefinitely. When both 
concurrent writers are running ingestion, without careful coordination between 
the writers (e.g., segregating writers by partitions), the consequences become 
more severe: conflicts occur more often, overall throughput is reduced, and 
compute costs increase. The key insight is that retries are the throughput 
killer—we need a fundamentally different approach.
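
To put rough numbers on how conflict probability grows with transaction duration, here is a minimal sketch. It assumes a simple Poisson model that is not part of the post: short commits arrive at a steady rate, and a fixed fraction of them touch files the long job also writes. The function name and parameters are hypothetical.

```python
import math


def conflict_probability(duration_min, commits_per_min, overlap_fraction):
    """Chance that at least one conflicting commit lands while a job runs.

    Assumes conflicting commits arrive as a Poisson process with rate
    commits_per_min * overlap_fraction (an illustrative model only).
    """
    conflicting_rate = commits_per_min * overlap_fraction
    return 1.0 - math.exp(-conflicting_rate * duration_min)


# One ingestion commit per minute, 5% of which overlap with the long job.
for duration in (1, 10, 60, 180):
    p = conflict_probability(duration, commits_per_min=1.0, overlap_fraction=0.05)
    print(f"{duration:>4} min job -> conflict probability {p:.3f}")
```

Under these assumed numbers a one-minute job conflicts about 5% of the time, while a one-hour job conflicts about 95% of the time, and every retry of the long job faces the same odds again.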
+
+## Hudi NBCC: Write in Parallel, Serialize by Completion Time
+
+NBCC avoids conflicts by design: let every writer append updates to Hudi’s log 
files in the Merge-on-Read (MOR) table, then let readers or mergers follow the 
serialization order based on write completion time. Let's say there are two 
writers, both updating a record concurrently. Under NBCC, each writer produces 
its own log file containing the update. Since there's no file contention, 
there's nothing to conflict on. At read time or during compaction, Hudi follows 
the write completion time and processes the associated log files in the proper 
order.
+
+![p2-nbcc-overview](/assets/images/blog/2025-12-16-maximizing-throughput-nbcc/p2-nbcc-overview.png)

Review Comment:
   Missing image file "p2-nbcc-overview.png". This image is referenced on line 
32 but not included in the PR.
   ```suggestion
   ![p2-nbcc-overview](/assets/images/blog/2025-12-16-maximizing-throughput-nbcc/p6-nbcc-compaction.png)
   ```



##########
website/blog/2025-12-16-maximizing-throughput-nbcc.md:
##########
@@ -0,0 +1,138 @@
+
+Both OCC and NBCC require locking—OCC during commit validation, NBCC during 
timestamp generation. The key difference is how long the lock is held, and what 
happens after. OCC holds the lock while validating: for concurrent commits, it 
compares the sets of written files to detect conflicts—so validation time grows 
with both transaction size and concurrent writer count. If validation detects a 
conflict, the losing writers discard their completed work and retry. NBCC's 
lock duration is a negligible constant (a configurable clock skew duration, 
200ms by default) regardless of transaction size: acquire lock, generate 
timestamp, sleep for clock skew, release. No file-level validation, no conflict 
detection, no retries.
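
The OCC validation step described above, comparing the sets of written files across concurrent commits, can be sketched as follows. This is an illustrative model, not Hudi's implementation; `Commit` and `has_conflict` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Commit:
    writer: str
    written_files: frozenset  # IDs of the files this commit wrote (illustrative)


def has_conflict(candidate, concurrently_committed):
    """Return True if the candidate wrote any file that a concurrent commit also wrote.

    The work grows with the number of written files and the number of
    concurrent commits, and it all happens while the lock is held.
    """
    return any(
        candidate.written_files & other.written_files
        for other in concurrently_committed
    )


ingestion = Commit("ingestion", frozenset({"fg-1", "fg-2"}))
deletion = Commit("gdpr-deletion", frozenset({"fg-2", "fg-7"}))
backfill = Commit("backfill", frozenset({"fg-9"}))

print(has_conflict(deletion, [ingestion]))  # both wrote fg-2
print(has_conflict(backfill, [ingestion]))  # no shared files
```

In this model the deletion job loses as soon as it shares a single file with an already committed ingestion write, and all the work behind its commit is discarded.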
+
+|                | OCC | NBCC |
+|:---------------|:----|:-----|
+| On conflict    | Abort and retry | No conflicts; each writer appends separately |
+| Lock duration  | Scales with the number of written files to validate | Constant (brief clock skew duration) |
+| Resource waste | High | Nearly none |
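
The NBCC side of the comparison (acquire lock, generate timestamp, sleep for clock skew, release) could look roughly like the sketch below. It is a single-process stand-in under stated assumptions: `threading.Lock` plays the role of the distributed lock, and `SkewAwareTimeGenerator` is a hypothetical name, not a Hudi class.

```python
import threading
import time


class SkewAwareTimeGenerator:
    """Hands out timestamps while holding a lock for a fixed skew window."""

    def __init__(self, max_clock_skew_ms=200):
        # A local lock stands in for the distributed lock real writers would share.
        self._lock = threading.Lock()
        self._skew_seconds = max_clock_skew_ms / 1000.0

    def next_timestamp_ms(self):
        with self._lock:
            timestamp_ms = time.time_ns() // 1_000_000
            # Wait out the expected clock skew before releasing the lock, so the
            # next holder's clock is assumed to have passed this timestamp.
            time.sleep(self._skew_seconds)
            return timestamp_ms


generator = SkewAwareTimeGenerator(max_clock_skew_ms=20)
first = generator.next_timestamp_ms()
second = generator.next_timestamp_ms()
print(first, second)
```

The lock is held for the skew window only, so its duration does not depend on how many files a writer produced.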
+
+Hudi supports both OCC and NBCC for multi-writer scenarios. Hudi also offers 
[early conflict 
detection](https://hudi.apache.org/docs/concurrency_control/#early-conflict-detection)
 for OCC, which can reduce wasted work by failing faster. However, OCC's 
validation lock duration still exceeds NBCC's timestamp generation time, and 
retries still occur after conflicts are detected—both impacting overall write 
throughput.
+
+## How NBCC Works Under the Hood
+
+Hudi NBCC relies on several design foundations to enable conflict-free 
concurrent writes and maximize throughput.
+
+### Record Keys and File Groups
+
+Hudi organizes data into file groups, where records with the same record key 
always route to the same file group. Hudi uses 
[indexes](https://hudi.apache.org/docs/indexes) to efficiently route records to 
file groups. For MOR tables, updates don't rewrite base files—instead, writers 
append updates to log files within the file group.
+
+![p3-recordkey-filegroup](/assets/images/blog/2025-12-16-maximizing-throughput-nbcc/p3-recordkey-filegroup.png)
+
+This record colocation is a key foundation for making NBCC possible. Records 
and their updates will always be routed to the same file group based on record 
keys, either in base files or log files—all associated with the same file ID 
that identifies the file group. The record key to file group mapping and the 
file ID association support read and merge operations by efficiently locating 
files to process. Also, concurrent Hudi writers use timestamps and write tokens 
to generate non-conflicting file names within each file group, and thus data 
writing can be non-blocking.
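
As a rough illustration of these two properties, the sketch below routes record keys to file groups with a plain hash and builds per-writer log file names. Both are simplifications: Hudi routes through its indexes, and the file name layout here is made up for illustration rather than Hudi's exact format.

```python
import zlib


def file_group_for(record_key, num_file_groups):
    """Deterministically map a record key to a file group (hash routing assumed)."""
    return zlib.crc32(record_key.encode("utf-8")) % num_file_groups


def log_file_name(file_id, instant_time, write_token):
    """Build a log file name unique per writer (illustrative layout only)."""
    return f".{file_id}_{instant_time}.log.1_{write_token}"


# The same key always lands in the same file group, whichever writer handles it.
group = file_group_for("user-42", num_file_groups=8)
assert group == file_group_for("user-42", num_file_groups=8)

# Two writers updating that file group produce distinct log files.
name_a = log_file_name(f"fg-{group}", "20251216000001000", "writer-a-0")
name_b = log_file_name(f"fg-{group}", "20251216000002000", "writer-b-0")
print(name_a)
print(name_b)
```

Because the names never collide, neither writer has to wait for the other before appending.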
+
+### Completion Time: Serializing Concurrent Writes
+
+With NBCC, concurrent writers produce log files whose write transactions 
overlap in time. To process these files correctly, we need a proper 
serialization order. Consider: Writer A starts deltacommit 1 at T1 and 
completes at T5; Writer B starts deltacommit 2 at T2 and completes at T4. If we 
order by start timestamp, deltacommit 1 would be processed first—but it 
actually finished later than deltacommit 2. Completion time reflects the 
correct order for processing the files.
+
+![p4-completion-time](/assets/images/blog/2025-12-16-maximizing-throughput-nbcc/p4-completion-time.png)
+
+Tracking the completion time is critical for NBCC. Concurrent writers flush 
records to files in parallel, and the start time gives no guarantee about the 
order in which they complete. The Hudi timeline records when each write 
actually completes, enabling the correct serialization order for the writes.
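
The Writer A and Writer B example can be replayed in a few lines. This is a toy model with hypothetical names (`DeltaCommit`, `merge`), using integers for T1 through T5 and a last-write-wins merge as the assumed merge behavior.

```python
from dataclasses import dataclass, field


@dataclass
class DeltaCommit:
    name: str
    start_time: int
    completion_time: int
    updates: dict = field(default_factory=dict)  # record key -> new value


def merge(commits, order_key):
    """Apply updates in the given order; later commits overwrite earlier ones."""
    merged = {}
    for commit in sorted(commits, key=order_key):
        merged.update(commit.updates)
    return merged


commits = [
    DeltaCommit("deltacommit 1 (Writer A)", 1, 5, {"user-42": "from A"}),
    DeltaCommit("deltacommit 2 (Writer B)", 2, 4, {"user-42": "from B"}),
]

by_start = merge(commits, order_key=lambda c: c.start_time)
by_completion = merge(commits, order_key=lambda c: c.completion_time)

print(by_start)       # ordering by start time lets Writer B's update win
print(by_completion)  # ordering by completion time lets Writer A's update win
```

Writer A finished last, so its update should be the one readers see, which is what ordering by completion time yields.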
+
+### TrueTime-like Timestamp Generation
+
+Distributed writers running on different machines face clock skew—their local 
clocks may differ by tens or even hundreds of milliseconds. Without 
coordination, two writers could generate the same timestamp or produce 
incorrect ordering.
+
+Hudi solves this with a TrueTime-like mechanism inspired by [Google 
Spanner](https://static.googleusercontent.com/media/research.google.com/en//archive/spanner-osdi2012.pdf):
+
+![p5-truetime](/assets/images/blog/2025-12-16-maximizing-throughput-nbcc/p5-truetime.gif)

Review Comment:
   Missing image file "p5-truetime.gif". This image is referenced on line 70 
but not included in the PR.
   ```suggestion
   ![p5-truetime](/assets/images/blog/2025-12-16-maximizing-throughput-nbcc/p5-truetime.png)
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.