hudi-agent commented on code in PR #13152:
URL: https://github.com/apache/hudi/pull/13152#discussion_r3408537004


##########
rfc/rfc-94/rfc-94.md:
##########
@@ -0,0 +1,545 @@
+<!--
+  Licensed to the Apache Software Foundation (ASF) under one or more
+  contributor license agreements.  See the NOTICE file distributed with
+  this work for additional information regarding copyright ownership.
+  The ASF licenses this file to You under the Apache License, Version 2.0
+  (the "License"); you may not use this file except in compliance with
+  the License.  You may obtain a copy of the License at
+
+       http://www.apache.org/licenses/LICENSE-2.0
+
+  Unless required by applicable law or agreed to in writing, software
+  distributed under the License is distributed on an "AS IS" BASIS,
+  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+  See the License for the specific language governing permissions and
+  limitations under the License.
+-->
+
+# RFC-94: Hudi Timeline User Interface (UI)
+
+## Proposers
+
+- @voonhous
+
+## Approvers
+
+- @danny0405
+- @rahil-c
+- @yihua
+
+## Status
+
+JIRA: [HUDI-9315](https://issues.apache.org/jira/browse/HUDI-9315)
+
+## Abstract
+
+Hudi Timeline metadata is stored as timestamped files representing state 
transitions of actions like `commit`,
+`deltacommit` and `compaction`. These files are accessible via the CLI or a 
file explorer, but it's hard to visualize
+concurrent actions, spot missing transitions, or tell how long each step took. 
Debugging timeline issues by reading
+filenames is tedious.
+
+This RFC proposes a UI-based timeline visualization tool that parses these 
metadata files, groups related actions, and
+renders them in a time-ordered, interactive view. Users can track the 
lifecycle of each operation, see concurrency
+patterns, and spot anomalies or long-running tasks. The implementation extends 
`hudi-timeline-service` with new `/v2/`
+REST APIs and a static HTML + JavaScript frontend powered by 
[vis-timeline](https://github.com/visjs/vis-timeline),
+served via Javalin's built-in static file serving with zero new Java 
compile-time dependencies.
+
+## Background
+
+Today, we rely on the CLI or direct filesystem inspection to understand 
timeline state through metadata files. These
+files represent different actions (e.g., `deltacommit`, `compaction`) and 
their lifecycle states (`requested`,
+`inflight`, `completed`), encoded in file names like:
+
+```shell
+20250409102118815.deltacommit.inflight
+20250409102118815.deltacommit.requested
+20250409102118815_20250409102124339.deltacommit
+20250409102121593.compaction.inflight
+20250409102121593.compaction.requested
+20250409102121593_20250409102122232.commit
+20250409102124581.deltacommit.inflight
+20250409102124581.deltacommit.requested
+20250409102124581_20250409102125667.deltacommit
+20250409102124612.compaction.inflight
+20250409102124612.compaction.requested
+20250409102124612_20250409102124892.commit
+20250409102127348.deltacommit.inflight
+20250409102127348.deltacommit.requested
+20250409102127348_20250409102128481.deltacommit
+20250409102127500.compaction.inflight
+20250409102127500.compaction.requested
+20250409102127500_20250409102127721.commit
+```
+
+This works, but has a few problems:
+
+1. No visibility into concurrency
+    - Multiple actions (e.g., `deltacommit` and `compaction`) often run 
concurrently.
+    - The CLI doesn't help correlate or visualize overlapping operations.
+2. Lack of temporal context
+    - Timestamps are embedded in filenames but are hard to compare visually - 
year, month and day can be quickly
+      determined, but minutes and seconds are harder to parse.
+    - No easy way to tell how long an action took or whether it's stalling 
unless you manually calculate the difference
+      between requested and completion time.
+3. Hard to spot inconsistencies or missing states
+    - An `inflight` compaction without a corresponding `commit` can indicate a 
starved/stuck compaction, which usually
+      blocks archiving/cleaning.
+    - These gaps are easy to miss when scanning filenames.
+
+On top of that, all timeline files are now stored as Avro binaries. Inspecting 
their contents requires custom Avro
+readers to convert the binaries to JSON.
+
+## Scope
+
+This RFC covers visualization of metadata available in Hudi tables. All 
features are **READ-ONLY** - there is no support
+for starting or spawning jobs that mutate a Hudi table.
+
+Alongside the timeline, the UI surfaces two additional read-only metadata 
views: the table's configuration
+(`hoodie.properties`) and its schema-change history.
+
+The following are **out of scope**:
+
+- **Archived timeline:** Only the active timeline is rendered. Loading 
instants from LSM-based archive files is left for
+  future work.
+- **Metadata table overlay:** The metadata table's own timeline is not shown 
alongside the main table timeline.
+- **Write/mutation operations:** The UI cannot trigger compactions, 
clustering, or any write action.
+- **Authentication/authorization:** No access control is added. The timeline 
server is assumed to run in a trusted
+  network, same as today.
+
+  **Threat model:** The UI does not widen the timeline server's exposure 
surface. The `/v2/` endpoints read the same
+  active-timeline and filesystem metadata that the existing `/v1/` REST APIs 
already serve, on the same network
+  interface (the server binds to all interfaces on the driver/standalone 
host). The UI is also opt-in and off by default
+  (`--enable-ui`). Operators on untrusted networks should front the server 
with a reverse proxy or restrict it to a
+  private interface / localhost via network policy.
+
+## Implementation
+
+Keeping the implementation lightweight is a priority - we should add as few 
dependencies as possible. Changes go into
+the existing `hudi-timeline-service` module, which contains a Javalin 
web-application that caches filesystem metadata of
+a Hudi table for job executors during tagging/writing.
+
+The first cut runs the UI on the Timeline Server in **STANDALONE** mode (see 
[Configuration](#configuration)) and is
+self-contained within `hudi-timeline-service`. Enabling the UI on the 
**EMBEDDED** timeline server inside a Spark
+driver, together with a Spark UI tab, requires cross-module wiring 
(`hudi-client-common`, `hudi-spark-client`); it is
+designed below but deferred to a follow-up to keep the initial PR small and 
focused. The standalone UI lands first; the
+embedded/Spark linking lands next.
+
+The Hudi Timeline UI has two parts: the frontend and backend.
+
+### Architecture
+
+The timeline server can run standalone or embedded inside a Spark driver. In 
embedded mode, a tab in the Spark UI links
+directly to the Hudi Timeline UI. The embedded mode and Spark UI tab (right 
side of the diagram below) are a planned
+follow-up; the first cut is standalone-only.
+
+```mermaid
+graph LR
+    Browser["Browser"]
+
+    subgraph Driver["Standalone / Spark Driver"]
+        subgraph TimelineServer["Javalin (Timeline Server)"]
+            Static["/ui + assets at root\n(HTML, JS, CSS)"]
+            API["/v2/hoodie/view/* - TimelineHandler"]
+            FSVM["FileSystemViewManager"]
+            Meta["HoodieTimeline / MetaClient"]
+
+            API --> FSVM --> Meta
+        end
+
+        subgraph SparkUI["Spark UI (:4040) - embedded mode (follow-up)"]
+            direction TB
+            SparkUIPad[ ] ~~~ Tabs["[Jobs] [Stages] ... [Hudi Timeline]"]
+        end
+
+        style SparkUIPad fill:none,stroke:none,color:none
+
+        Tabs -- "link" --> Static
+    end
+
+    Browser -- "HTTP" --> Static
+    Browser -- "HTTP" --> API
+    Browser -. "HTTP\n(embedded mode)" .-> SparkUI
+```
+
+There are two categories of requests:
+
+1. **Static file requests** - Javalin serves HTML, JavaScript, and CSS files 
from the classpath
+   (`src/main/resources/public/`) at the server root; `UiHandler` serves 
`index.html` at `/ui`. No server-side
+   rendering or template engine is needed.
+2. **REST API requests** (`/v2/hoodie/view/*`) - `TimelineHandler` processes 
these requests, reading timeline data from
+   the `FileSystemViewManager` (and a per-basepath `HoodieTableMetaClient` for 
table config/schema), returning JSON.
+
+### Frontend
+
+The frontend is static HTML pages with vanilla JavaScript, similar to the 
Spark Web UI. Javalin's built-in static file
+serving handles files from the classpath - no template engine (e.g., 
Thymeleaf) is needed and no new Java compile-time
+dependencies are added.
+
+No frontend build pipeline (npm, webpack, vite) is needed. Contributing to the 
UI requires only a text editor. The only
+external library is vis-timeline for timeline rendering.
+
+#### File Structure
+
+```
+hudi-timeline-service/src/main/resources/public/
+├── index.html                     # Landing page with basepath input form
+├── js/
+│   └── timeline.js                # vis-timeline initialization and REST API calls
+├── css/
+│   └── style.css                  # Basic styling
+└── lib/
+    └── vis-timeline/              # Bundled vis-timeline assets
+        ├── vis-timeline-graph2d.min.js
+        └── vis-timeline-graph2d.min.css
+```
+
+#### JavaScript Delivery: Bundled, No External Calls
+
+The vis-timeline library is served from the bundled copy at 
`/lib/vis-timeline/`. The UI makes no external network
+calls, so it works out of the box in air-gapped and security-conscious 
deployments with no extra configuration. The
+bundled assets add ~300KB to the JAR.
+
+Pinning a vendored copy (rather than loading from a CDN) keeps the UI 
deterministic and avoids a runtime dependency on
+an external host being reachable. If automatic patch updates are wanted later, 
a CDN source can be added as an opt-in
+config flag without changing this default.
+
+#### vis-timeline Configuration
+
+The timeline is configured with groups and items that map to Hudi's timeline 
model:
+
+- **Groups:** One row per action type - `commit`, `deltacommit`, `compaction`, 
`clean`, `rollback`, `clustering`,
+  `savepoint`, `logcompaction`, `indexing`, `restore`, `replacecommit`. These 
correspond to the actions in
+  `HoodieTimeline.VALID_ACTIONS_IN_TIMELINE`.
+- **Items:** Completed instants are rendered as range bars spanning from 
`requestedTime` to `completionTime`.
+  Non-completed instants (requested or inflight) are rendered as point items 
at `requestedTime`.
+- **Color coding:** Items are colored by state:
+    - Green -> `COMPLETED`
+    - Yellow -> `INFLIGHT`
+    - Red -> `REQUESTED`
+- **Tooltip:** On hover, shows the action type, requested time, completion 
time, and duration.
+- **Click handler:** Clicking an instant fetches its detail via 
`/v2/hoodie/view/timeline/instant` and shows the
+  deserialized JSON in a detail panel below the timeline.
+
+### Backend
+
+A `hudi-timeline-service` instance already serves filesystem metadata for multiple table basepaths, since the
+`FileSystemView`s are cached in a map keyed by basepath.
+
+We extend this module with `/v2/` APIs to serve the timeline metadata needed 
by the UI.
+
+#### API Specification
+
+| Method | Path | Parameters | Response | Description |
+|--------|------|------------|----------|-------------|
+| GET    | `/v2/hoodie/view/timeline/instants/all` | `basepath` (required) | `TimelineDTOV2` | All active instants (each with requested time, completion time, action, state), wrapped in a timeline DTO |
+| GET    | `/v2/hoodie/view/timeline/instant` | `basepath`, `instant`, `instantaction`, `instantstate` (all required) | JSON string | Deserialized content of a specific instant's metadata (Avro -> JSON) |
+| GET    | `/v2/hoodie/view/table/config` | `basepath` (required) | JSON object | The table's `hoodie.properties` (sorted) |
+| GET    | `/v2/hoodie/view/table/schema/history` | `basepath` (required), `limit` (optional, default 200, max 1000) | JSON object | Current table schema plus schema-change history from recent commits |
+
+Static files (HTML, JS, CSS) are served from the classpath under 
`src/main/resources/public/` at the server root (e.g.,
+`/js/timeline.js`, `/lib/...`). `UiHandler` additionally registers `GET /ui`, 
which returns `index.html` to give the UI
+a stable entry URL.
+
+**On response size and pagination:** `GET 
/v2/hoodie/view/timeline/instants/all` returns the full active timeline. The
+active timeline is bounded by archiving (the unbounded archived timeline is 
out of scope), so instant counts are
+typically modest. The first cut intentionally returns all active instants and 
relies on client-side zoom/scroll and
+filtering for navigation. If active-timeline sizes become a concern, the 
endpoint can be extended additively with
+optional `from`/`to` time-range query params (and/or a `limit`) without 
breaking the existing contract.
+
+#### DTO Design
+
+Two v2 DTOs are introduced in a `v2` package to avoid modifying the existing 
`/v1/` API contract:
+
+- **`InstantDTO`** (`o.a.h.common.table.timeline.dto.v2`) - the v1 
`InstantDTO` only exposes `action`, `timestamp`
+  (requested time), and `state`; it lacks completion time, needed for 
rendering range bars. The v2 `InstantDTO` has:
+    - `action` - the action type (e.g., `commit`, `deltacommit`, `compaction`)
+    - `requestedTime` (JSON `requestTs`) - requested timestamp 
(`HoodieInstant.requestedTime()`)
+    - `completionTime` (JSON `completionTs`) - completion timestamp 
(`HoodieInstant.getCompletionTime()`), null for
+      non-completed instants
+    - `state` - the instant state (`REQUESTED`, `INFLIGHT`, `COMPLETED`)
+- **`TimelineDTOV2`** - wraps a `List<InstantDTO>` (`instants`); this is what 
`/v2/hoodie/view/timeline/instants/all`
+  returns.
+
+#### Handler Design
+
+The v2 endpoints are served by the existing `TimelineHandler` (which already 
serves the v1 timeline routes); a separate
+`UiHandler` serves only the UI entry page.
+
+`TimelineHandler` methods:
+
+1. `getTimelineV2(basePath)` - maps 
`viewManager.getFileSystemView(basePath).getTimeline()` to a `TimelineDTOV2` 
(the
+   active timeline's instants, each including completion time).
+2. `getInstantDetails(basePath, instant, action, state)` - reads the instant's Avro content via the active timeline's
+   `getInstantDetails()` and deserializes it to JSON. The instant is created with the timeline's own layout-aware
+   `InstantGenerator`. A malformed `state`/`action` returns 400; a read failure is logged and returns 500.

Review Comment:
   🤖 The cached `HoodieTableMetaClient` is built once per basepath, but there's 
no invalidation strategy described. In an embedded server running inside a 
long-lived Spark driver, the table's schema or `hoodie.properties` can change 
after the cache is populated — the UI will then show stale values for the rest 
of the driver's lifetime. Could you spell out a refresh approach (TTL, reload 
on each request, or an explicit invalidation hook tied to timeline changes)?
   
   <sub><i>- AI-generated; verify before applying. React 👍/👎 to flag 
quality.</i></sub>
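
To make the TTL option concrete, here is a minimal sketch of a per-basepath cache that rebuilds an entry once it is older than a configured age and also exposes an explicit invalidation hook. It is only an illustration under stated assumptions: `TtlCacheSketch`, its constructor parameters, and the injected clock are hypothetical names that do not exist in the PR, and the type parameter `V` stands in for `HoodieTableMetaClient`.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;
import java.util.function.LongSupplier;

/** Hypothetical TTL cache keyed by basepath; V stands in for HoodieTableMetaClient. */
final class TtlCacheSketch<V> {

  /** Cached value plus the time (in ms) at which it was loaded. */
  private static final class Entry<V> {
    final V value;
    final long loadedAtMs;

    Entry(V value, long loadedAtMs) {
      this.value = value;
      this.loadedAtMs = loadedAtMs;
    }
  }

  private final ConcurrentHashMap<String, Entry<V>> cache = new ConcurrentHashMap<>();
  private final long ttlMs;
  private final LongSupplier clockMs;       // injected so tests can control time
  private final Function<String, V> loader; // assumed to build a fresh value for a basepath

  TtlCacheSketch(long ttlMs, LongSupplier clockMs, Function<String, V> loader) {
    this.ttlMs = ttlMs;
    this.clockMs = clockMs;
    this.loader = loader;
  }

  /** Returns the cached value, reloading it when it is missing or at least ttlMs old. */
  V get(String basePath) {
    long now = clockMs.getAsLong();
    // compute() is atomic per key, so concurrent callers for one basepath trigger one reload.
    Entry<V> entry = cache.compute(basePath, (key, old) ->
        (old == null || now - old.loadedAtMs >= ttlMs)
            ? new Entry<>(loader.apply(key), now)
            : old);
    return entry.value;
  }

  /** Explicit invalidation hook, e.g. callable when the timeline is known to have changed. */
  void invalidate(String basePath) {
    cache.remove(basePath);
  }
}
```

In the handler this would be constructed with `System::currentTimeMillis` as the clock and a loader that builds the metaClient. Because the loader runs inside `compute()`, other updates that hash to the same bin wait while a reload is in progress; whether that is acceptable depends on how expensive metaClient construction is for the storage in use.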



##########
rfc/rfc-94/rfc-94.md:
##########
@@ -0,0 +1,545 @@
+3. `getTableConfig(basePath)` / `getSchemaHistory(basePath, limit)` - serve 
the table-config and schema-history views.
+   Both reuse a `HoodieTableMetaClient` cached per basepath in a 
`ConcurrentHashMap` (built once on first access), so
+   repeated requests pay only the targeted read, not metaClient construction.
+
+`UiHandler` registers `GET /ui`, returning `/public/index.html` from the 
classpath as the UI entry page.
+
+#### Registration in RequestHandler
+
+The v2 routes are registered following the existing pattern:
+
+- The v1 timeline routes remain registered unconditionally in 
`registerTimelineAPI()`.
+- The v2 UI routes are registered in `registerTimelineV2API()`, called from 
`register()` only when `--enable-ui` is set.
+  `UiHandler` (serving `/ui`) and the static-file serving are gated by the 
same flag.
+
+#### Error Handling
+
+- **Invalid basepath** -> HTTP 400 with a descriptive error message (e.g., 
"Not a valid Hudi table path").
+- **Empty timeline** -> Returns a `TimelineDTOV2` whose `instants` list is empty (`[]`). The frontend displays "No
+  instants found".
+- **Failed instant detail read** -> HTTP 500 with error details (e.g., Avro 
deserialization failure).
+
+### Feature
+
+The first cut presents three read-only tabs for a Hudi table: **Timeline**, 
**Table Config**, and **Schema History**.
+
+The permitted user actions are:
+
+1. User is able to input a Hudi table basepath
+2. User is able to click submit after inputting Hudi table basepath
+3. The timeline of the Hudi table is rendered
+4. User is able to scroll through timeline (horizontally)
+5. User is able to zoom in and out of timeline
+6. User is able to hover over instant for more details
+7. User is able to click on a specific instant, and the JSON string of the instant's details is rendered
+8. User is able to view the table's configuration (`hoodie.properties`) in the 
Table Config tab
+9. User is able to view the table's schema and schema-change history in the 
Schema History tab
+
+Each action type occupies its own horizontal row so concurrent actions are 
visually separated. Completed instants appear
+as horizontal bars whose width represents duration (requested -> completed). 
Inflight and requested instants appear as
+point markers. Color indicates state: green for completed, yellow for 
inflight, red for requested.
+
+### Examples
+
+Proof of concept (PoC) snapshots:
+
+**Main Page with Timeline Rendered**
+![timeline_main](images/timeline_main.png)
+
+**Hovering Over an Instant**
+![timeline_hover](images/timeline_hover.png)
+
+**Selecting an Instant**
+![timeline_select](images/timeline_instant_select.png)
+
+## Configuration
+
+### Standalone Mode
+
+To start the Timeline Server in standalone mode with the UI enabled:
+
+```shell
+java -cp hudi-timeline-server-bundle-*.jar \
+  org.apache.hudi.timeline.service.TimelineService \
+  --server-port 26754 \
+  --enable-ui
+```
+
+Once started, the UI is accessible at `http://localhost:26754/ui`.
+
+The server port is configurable via the existing `--server-port` (or `-p`) 
flag (default: `26754`). The `--enable-ui`
+flag controls whether the UI static files, the `/ui` page, and the 
`/v2/hoodie/view/` UI API endpoints are registered.
+When the flag is not set, the timeline server behaves exactly as it does today 
- no UI-related routes are added.
+
+### Embedded Mode (Spark-Shell / Spark Driver)
+
+> **Status: deferred to a follow-up.** Embedded-mode UI enablement is 
intentionally split out of the initial PR to keep
+> it small: the standalone UI ships first, then the embedded server is wired 
to enable it. The design below is retained
+> for that follow-up.
+
+When running Hudi inside a Spark application, the `EmbeddedTimelineService` 
already starts a timeline server within the
+driver process. The UI can be enabled on this embedded server by setting a 
Spark configuration property:
+
+```
+hoodie.embed.timeline.server.ui.enable = true
+```
+
+This property defaults to `false`. When set to `true`, the embedded timeline 
server registers the same UI routes and
+static file serving as the standalone mode.
+
+#### Starting from spark-shell
+
+```shell
+spark-shell \
+  --packages org.apache.hudi:hudi-spark3-bundle_2.12:1.2.0 \
+  --conf "hoodie.embed.timeline.server.ui.enable=true"
+```
+
+Once a write operation initializes the `EmbeddedTimelineService`, the UI 
becomes available at

Review Comment:
   🤖 The Module Placement note says `EmbeddedTimelineService` (in 
`hudi-client-common`) coordinates registration by calling into the 
Spark-specific tab class in `hudi-spark-client`, but `hudi-client-common` 
cannot depend on `hudi-spark-client`. Could you specify the dispatch mechanism 
— `ServiceLoader`/SPI, reflection on a fully-qualified class name, or a 
callback registered through `HoodieEngineContext`? This is the kind of 
cross-module wiring that's worth pinning down before implementation. @bvaradar 
this touches a module-boundary contract worth a quick sanity check.
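   
   For illustration only, the reflection option could look roughly like the
   sketch below. `TimelineUiTabRegistrar`, `DemoRegistrar`, and `load` are
   hypothetical names, not existing Hudi APIs; the sketch only shows the common
   module resolving an engine-specific class by name and degrading to a no-op
   when that class is not on the classpath.

```java
import java.util.Optional;

public class UiTabDispatchSketch {

    /** Hypothetical extension point that would live in the common module. */
    public interface TimelineUiTabRegistrar {
        String register(String timelineUiUrl);
    }

    /** Stand-in for an engine-specific implementation (assumed, for the demo only). */
    public static class DemoRegistrar implements TimelineUiTabRegistrar {
        @Override
        public String register(String timelineUiUrl) {
            return "tab -> " + timelineUiUrl;
        }
    }

    /**
     * Resolves the registrar by fully-qualified class name. Returns empty when
     * the class is missing, is not a registrar, or cannot be instantiated, so
     * the caller never needs a compile-time dependency on the implementation.
     */
    public static Optional<TimelineUiTabRegistrar> load(String className) {
        try {
            Class<?> clazz = Class.forName(className);
            if (!TimelineUiTabRegistrar.class.isAssignableFrom(clazz)) {
                return Optional.empty();
            }
            Object instance = clazz.getDeclaredConstructor().newInstance();
            return Optional.of((TimelineUiTabRegistrar) instance);
        } catch (ReflectiveOperationException | LinkageError e) {
            // Class absent or not instantiable: treat as "no UI tab available".
            return Optional.empty();
        }
    }

    public static void main(String[] args) {
        String present = DemoRegistrar.class.getName();
        System.out.println(load(present)
            .map(r -> r.register("http://localhost:26754/ui"))
            .orElse("no tab"));
        System.out.println(load("org.example.MissingRegistrar")
            .map(r -> r.register("http://localhost:26754/ui"))
            .orElse("no tab"));
    }
}
```

   A `ServiceLoader`-based variant would replace `Class.forName` with a provider
   file under `META-INF/services`; either way the RFC should state which module
   owns the interface and which owns the implementation.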
   
   <sub><i>- AI-generated; verify before applying. React 👍/👎 to flag 
quality.</i></sub>



##########
rfc/rfc-94/rfc-94.md:
##########
@@ -0,0 +1,545 @@
+<!--
+  Licensed to the Apache Software Foundation (ASF) under one or more
+  contributor license agreements.  See the NOTICE file distributed with
+  this work for additional information regarding copyright ownership.
+  The ASF licenses this file to You under the Apache License, Version 2.0
+  (the "License"); you may not use this file except in compliance with
+  the License.  You may obtain a copy of the License at
+
+       http://www.apache.org/licenses/LICENSE-2.0
+
+  Unless required by applicable law or agreed to in writing, software
+  distributed under the License is distributed on an "AS IS" BASIS,
+  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+  See the License for the specific language governing permissions and
+  limitations under the License.
+-->
+
+# RFC-94: Hudi Timeline User Interface (UI)
+
+## Proposers
+
+- @voonhous
+
+## Approvers
+
+- @danny0405
+- @rahil-c
+- @yihua
+
+## Status
+
+JIRA: [HUDI-9315](https://issues.apache.org/jira/browse/HUDI-9315)
+
+## Abstract
+
+Hudi Timeline metadata is stored as timestamped files representing state 
transitions of actions like `commit`,
+`deltacommit` and `compaction`. These files are accessible via the CLI or a 
file explorer, but it's hard to visualize
+concurrent actions, spot missing transitions, or tell how long each step took. 
Debugging timeline issues by reading
+filenames is tedious.
+
+This RFC proposes a UI-based timeline visualization tool that parses these 
metadata files, groups related actions, and
+renders them in a time-ordered, interactive view. Users can track the 
lifecycle of each operation, see concurrency
+patterns, and spot anomalies or long-running tasks. The implementation extends 
`hudi-timeline-service` with new `/v2/`
+REST APIs and a static HTML + JavaScript frontend powered by 
[vis-timeline](https://github.com/visjs/vis-timeline),
+served via Javalin's built-in static file serving with zero new Java 
compile-time dependencies.
+
+## Background
+
+Today, we rely on the CLI or direct filesystem inspection to understand 
timeline state through metadata files. These
+files represent different actions (e.g., `deltacommit`, `compaction`) and 
their lifecycle states (`requested`,
+`inflight`, `completed`), encoded in file names like:
+
+```shell
+20250409102118815.deltacommit.inflight
+20250409102118815.deltacommit.requested
+20250409102118815_20250409102124339.deltacommit
+20250409102121593.compaction.inflight
+20250409102121593.compaction.requested
+20250409102121593_20250409102122232.commit
+20250409102124581.deltacommit.inflight
+20250409102124581.deltacommit.requested
+20250409102124581_20250409102125667.deltacommit
+20250409102124612.compaction.inflight
+20250409102124612.compaction.requested
+20250409102124612_20250409102124892.commit
+20250409102127348.deltacommit.inflight
+20250409102127348.deltacommit.requested
+20250409102127348_20250409102128481.deltacommit
+20250409102127500.compaction.inflight
+20250409102127500.compaction.requested
+20250409102127500_20250409102127721.commit
+```
+
+This works, but has a few problems:
+
+1. No visibility into concurrency
+    - Multiple actions (e.g., `deltacommit` and `compaction`) often run 
concurrently.
+    - The CLI doesn't help correlate or visualize overlapping operations.
+2. Lack of temporal context
+    - Timestamps are embedded in filenames but are hard to compare visually - 
year, month and day can be quickly
+      determined, but minutes and seconds are harder to parse.
+    - No easy way to tell how long an action took or whether it's stalling 
unless you manually calculate the difference
+      between requested and completion time.
+3. Hard to spot inconsistencies or missing states
+    - An `inflight` compaction without a corresponding `commit` can indicate a 
starved/stuck compaction, which usually
+      blocks archiving/cleaning.
+    - These gaps are easy to miss when scanning filenames.
+
+On top of that, all timeline files are now stored as Avro binaries. Inspecting 
their contents requires custom Avro
+readers to convert the binaries to JSON.
+
+## Scope
+
+This RFC covers visualization of metadata available in Hudi tables. All 
features are **READ-ONLY** - there is no support
+for starting or spawning jobs that mutate a Hudi table.
+
+Alongside the timeline, the UI surfaces two additional read-only metadata 
views: the table's configuration
+(`hoodie.properties`) and its schema-change history.
+
+The following are **out of scope**:
+
+- **Archived timeline:** Only the active timeline is rendered. Loading 
instants from LSM-based archive files is left for
+  future work.
+- **Metadata table overlay:** The metadata table's own timeline is not shown 
alongside the main table timeline.
+- **Write/mutation operations:** The UI cannot trigger compactions, 
clustering, or any write action.
+- **Authentication/authorization:** No access control is added. The timeline 
server is assumed to run in a trusted
+  network, same as today.
+
+  **Threat model:** The UI does not widen the timeline server's exposure 
surface. The `/v2/` endpoints read the same
+  active-timeline and filesystem metadata that the existing `/v1/` REST APIs 
already serve, on the same network
+  interface (the server binds to all interfaces on the driver/standalone 
host). The UI is also opt-in and off by default
+  (`--enable-ui`). Operators on untrusted networks should front the server 
with a reverse proxy or restrict it to a
+  private interface / localhost via network policy.
+
+## Implementation
+
+Keeping the implementation lightweight is a priority - we should add as few 
dependencies as possible. Changes go into
+the existing `hudi-timeline-service` module, which contains a Javalin 
web-application that caches filesystem metadata of
+a Hudi table for job executors during tagging/writing.
+
+The first cut runs the UI on the Timeline Server in **STANDALONE** mode (see 
[Configuration](#configuration)) and is
+self-contained within `hudi-timeline-service`. Enabling the UI on the 
**EMBEDDED** timeline server inside a Spark
+driver, together with a Spark UI tab, requires cross-module wiring 
(`hudi-client-common`, `hudi-spark-client`); it is
+designed below but deferred to a follow-up to keep the initial PR small and 
focused. The standalone UI lands first; the
+embedded/Spark linking lands next.
+
+The Hudi Timeline UI has two parts: the frontend and backend.
+
+### Architecture
+
+The timeline server can run standalone or embedded inside a Spark driver. In 
embedded mode, a tab in the Spark UI links
+directly to the Hudi Timeline UI. The embedded mode and Spark UI tab (right 
side of the diagram below) are a planned
+follow-up; the first cut is standalone-only.
+
+```mermaid
+graph LR
+    Browser["Browser"]
+
+    subgraph Driver["Standalone / Spark Driver"]
+        subgraph TimelineServer["Javalin (Timeline Server)"]
+            Static["/ui + assets at root\n(HTML, JS, CSS)"]
+            API["/v2/hoodie/view/* - TimelineHandler"]
+            FSVM["FileSystemViewManager"]
+            Meta["HoodieTimeline / MetaClient"]
+
+            API --> FSVM --> Meta
+        end
+
+        subgraph SparkUI["Spark UI (:4040) - embedded mode (follow-up)"]
+            direction TB
+            SparkUIPad[ ] ~~~ Tabs["[Jobs] [Stages] ... [Hudi Timeline]"]
+        end
+
+        style SparkUIPad fill:none,stroke:none,color:none
+
+        Tabs -- "link" --> Static
+    end
+
+    Browser -- "HTTP" --> Static
+    Browser -- "HTTP" --> API
+    Browser -. "HTTP\n(embedded mode)" .-> SparkUI
+```
+
+There are two categories of requests:
+
+1. **Static file requests** - Javalin serves HTML, JavaScript, and CSS files 
from the classpath
+   (`src/main/resources/public/`) at the server root; `UiHandler` serves 
`index.html` at `/ui`. No server-side
+   rendering or template engine is needed.
+2. **REST API requests** (`/v2/hoodie/view/*`) - `TimelineHandler` processes 
these requests, reading timeline data from
+   the `FileSystemViewManager` (and a per-basepath `HoodieTableMetaClient` for 
table config/schema), returning JSON.
+
+### Frontend
+
+The frontend is static HTML pages with vanilla JavaScript, similar to the 
Spark Web UI. Javalin's built-in static file
+serving handles files from the classpath - no template engine (e.g., 
Thymeleaf) is needed and no new Java compile-time
+dependencies are added.
+
+No frontend build pipeline (npm, webpack, vite) is needed. Contributing to the 
UI requires only a text editor. The only
+external library is vis-timeline for timeline rendering.
+
+#### File Structure
+
+```
+hudi-timeline-service/src/main/resources/public/
+├── index.html                     # Landing page with basepath input form
+├── js/
+│   └── timeline.js                # vis-timeline initialization and REST API calls
+├── css/
+│   └── style.css                  # Basic styling
+└── lib/
+    └── vis-timeline/              # Bundled vis-timeline assets
+        ├── vis-timeline-graph2d.min.js
+        └── vis-timeline-graph2d.min.css
+```
+
+#### JavaScript Delivery: Bundled, No External Calls
+
+The vis-timeline library is served from the bundled copy at 
`/lib/vis-timeline/`. The UI makes no external network
+calls, so it works out of the box in air-gapped and security-conscious 
deployments with no extra configuration. The
+bundled assets add ~300KB to the JAR.
+
+Pinning a vendored copy (rather than loading from a CDN) keeps the UI 
deterministic and avoids a runtime dependency on
+an external host being reachable. If automatic patch updates are wanted later, 
a CDN source can be added as an opt-in
+config flag without changing this default.
+
+#### vis-timeline Configuration
+
+The timeline is configured with groups and items that map to Hudi's timeline 
model:
+
+- **Groups:** One row per action type - `commit`, `deltacommit`, `compaction`, 
`clean`, `rollback`, `clustering`,
+  `savepoint`, `logcompaction`, `indexing`, `restore`, `replacecommit`. These 
correspond to the actions in
+  `HoodieTimeline.VALID_ACTIONS_IN_TIMELINE`.
+- **Items:** Completed instants are rendered as range bars spanning from 
`requestedTime` to `completionTime`.
+  Non-completed instants (requested or inflight) are rendered as point items 
at `requestedTime`.
+- **Color coding:** Items are colored by state:
+    - Green -> `COMPLETED`
+    - Yellow -> `INFLIGHT`
+    - Red -> `REQUESTED`
+- **Tooltip:** On hover, shows the action type, requested time, completion 
time, and duration.
+- **Click handler:** Clicking an instant fetches its detail via 
`/v2/hoodie/view/timeline/instant` and shows the
+  deserialized JSON in a detail panel below the timeline.
+
+### Backend
+
+A `hudi-timeline-service` instance already serves filesystem metadata for 
multiple table basePaths since the
+`FileSystemView`s are cached in a map keyed by basepath.
+
+We extend this module with `/v2/` APIs to serve the timeline metadata needed 
by the UI.
+
+#### API Specification
+
+| Method | Path                                    | Parameters                                                            | Response        | Description                                                                                  |
+|--------|-----------------------------------------|-----------------------------------------------------------------------|-----------------|----------------------------------------------------------------------------------------------|
+| GET    | `/v2/hoodie/view/timeline/instants/all` | `basepath` (required)                                                 | `TimelineDTOV2` | All active instants (each with requested time, completion time, action, state), wrapped in a timeline DTO |
+| GET    | `/v2/hoodie/view/timeline/instant`      | `basepath`, `instant`, `instantaction`, `instantstate` (all required) | JSON string     | Deserialized content of a specific instant's metadata (Avro -> JSON)                         |
+| GET    | `/v2/hoodie/view/table/config`          | `basepath` (required)                                                 | JSON object     | The table's `hoodie.properties` (sorted)                                                     |
+| GET    | `/v2/hoodie/view/table/schema/history`  | `basepath` (required), `limit` (optional, default 200, max 1000)      | JSON object     | Current table schema plus schema-change history from recent commits                          |
+
+Static files (HTML, JS, CSS) are served from the classpath under 
`src/main/resources/public/` at the server root (e.g.,
+`/js/timeline.js`, `/lib/...`). `UiHandler` additionally registers `GET /ui`, 
which returns `index.html` to give the UI
+a stable entry URL.

Review Comment:
   🤖 Could you specify how the schema-change history is reconstructed? The 
endpoint says "schema-change history from recent commits" with a default 
`limit=200` (max `1000`), but the actual source is ambiguous — scanning commit 
`extraMetadata` for `schema` keys, reading `.hoodie/.schema/` when 
`InternalSchema` is on, or both? The cost model (per-commit metadata read for 
up to 1000 commits) and the behavior for tables that never use schema evolution 
depend on the answer.
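   
   As a point of reference, one possible reading (scan the most recent `limit`
   commits and keep only those whose schema differs from the previous one seen)
   might be sketched as follows. This is an assumption about the intended
   behavior, not the PR's implementation: `CommitSchema`, `effectiveLimit`, and
   `changes` are hypothetical names, and clamping an out-of-range `limit`
   instead of rejecting it is also assumed.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class SchemaHistorySketch {

    static final int DEFAULT_LIMIT = 200;
    static final int MAX_LIMIT = 1000;

    /** One completed commit and the schema string found in its metadata (null if absent). */
    public static final class CommitSchema {
        public final String instantTime;
        public final String schema;

        public CommitSchema(String instantTime, String schema) {
            this.instantTime = instantTime;
            this.schema = schema;
        }
    }

    /** Applies the default when `limit` is omitted and clamps it into [1, MAX_LIMIT]. */
    public static int effectiveLimit(Integer requested) {
        if (requested == null) {
            return DEFAULT_LIMIT;
        }
        return Math.max(1, Math.min(requested, MAX_LIMIT));
    }

    /**
     * Looks at the most recent `limit` commits (input is ordered oldest first)
     * and keeps a commit only when its schema differs from the last schema
     * kept. The first commit in the window that carries a schema is always
     * kept as the baseline; commits without a schema are skipped.
     */
    public static List<CommitSchema> changes(List<CommitSchema> commitsOldestFirst, Integer requestedLimit) {
        int limit = effectiveLimit(requestedLimit);
        int from = Math.max(0, commitsOldestFirst.size() - limit);
        List<CommitSchema> history = new ArrayList<>();
        String previous = null;
        for (CommitSchema commit : commitsOldestFirst.subList(from, commitsOldestFirst.size())) {
            if (commit.schema == null) {
                continue;
            }
            if (!commit.schema.equals(previous)) {
                history.add(commit);
                previous = commit.schema;
            }
        }
        return history;
    }

    public static void main(String[] args) {
        List<CommitSchema> commits = Arrays.asList(
            new CommitSchema("001", "A"),
            new CommitSchema("002", "A"),
            new CommitSchema("003", null),
            new CommitSchema("004", "B"));
        for (CommitSchema change : changes(commits, null)) {
            System.out.println(change.instantTime + " -> " + change.schema);
        }
    }
}
```

   Under this reading the cost is one metadata read per commit in the window,
   and a table that never evolves its schema yields a single baseline entry.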
   
   <sub><i>- AI-generated; verify before applying. React 👍/👎 to flag 
quality.</i></sub>



##########
rfc/rfc-94/rfc-94.md:
##########
@@ -0,0 +1,545 @@
+Static files (HTML, JS, CSS) are served from the classpath under 
`src/main/resources/public/` at the server root (e.g.,
+`/js/timeline.js`, `/lib/...`). `UiHandler` additionally registers `GET /ui`, 
which returns `index.html` to give the UI
+a stable entry URL.
+
+**On response size and pagination:** `GET 
/v2/hoodie/view/timeline/instants/all` returns the full active timeline. The
+active timeline is bounded by archiving (the unbounded archived timeline is 
out of scope), so instant counts are
+typically modest. The first cut intentionally returns all active instants and 
relies on client-side zoom/scroll and
+filtering for navigation. If active-timeline sizes become a concern, the 
endpoint can be extended additively with

Review Comment:
   🤖 Mounting the static assets at the server root (`/js/...`, `/css/...`, 
`/lib/...`) reserves these path prefixes, making them unavailable to any future 
`/v1/`, `/v2/`, or other module-registered routes on the same Javalin instance. 
Could you namespace them under `/ui/static/...` (matching the `/ui` entry page) 
so the UI surface and the API surface don't collide as new endpoints are added?
   
   <sub><i>- AI-generated; verify before applying. React 👍/👎 to flag 
quality.</i></sub>



##########
rfc/rfc-94/rfc-94.md:
##########
@@ -0,0 +1,545 @@
+<!--
+  Licensed to the Apache Software Foundation (ASF) under one or more
+  contributor license agreements.  See the NOTICE file distributed with
+  this work for additional information regarding copyright ownership.
+  The ASF licenses this file to You under the Apache License, Version 2.0
+  (the "License"); you may not use this file except in compliance with
+  the License.  You may obtain a copy of the License at
+
+       http://www.apache.org/licenses/LICENSE-2.0
+
+  Unless required by applicable law or agreed to in writing, software
+  distributed under the License is distributed on an "AS IS" BASIS,
+  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+  See the License for the specific language governing permissions and
+  limitations under the License.
+-->
+
+# RFC-94: Hudi Timeline User Interface (UI)
+
+## Proposers
+
+- @voonhous
+
+## Approvers
+
+- @danny0405
+- @rahil-c
+- @yihua
+
+## Status
+
+JIRA: [HUDI-9315](https://issues.apache.org/jira/browse/HUDI-9315)
+
+## Abstract
+
+Hudi Timeline metadata is stored as timestamped files representing state 
transitions of actions like `commit`,
+`deltacommit` and `compaction`. These files are accessible via the CLI or a 
file explorer, but it's hard to visualize
+concurrent actions, spot missing transitions, or tell how long each step took. 
Debugging timeline issues by reading
+filenames is tedious.
+
+This RFC proposes a UI-based timeline visualization tool that parses these 
metadata files, groups related actions, and
+renders them in a time-ordered, interactive view. Users can track the 
lifecycle of each operation, see concurrency
+patterns, and spot anomalies or long-running tasks. The implementation extends 
`hudi-timeline-service` with new `/v2/`
+REST APIs and a static HTML + JavaScript frontend powered by 
[vis-timeline](https://github.com/visjs/vis-timeline),
+served via Javalin's built-in static file serving with zero new Java 
compile-time dependencies.
+
+## Background
+
+Today, we rely on the CLI or direct filesystem inspection to understand 
timeline state through metadata files. These
+files represent different actions (e.g., `deltacommit`, `compaction`) and 
their lifecycle states (`requested`,
+`inflight`, `completed`), encoded in file names like:
+
+```shell
+20250409102118815.deltacommit.inflight
+20250409102118815.deltacommit.requested
+20250409102118815_20250409102124339.deltacommit
+20250409102121593.compaction.inflight
+20250409102121593.compaction.requested
+20250409102121593_20250409102122232.commit
+20250409102124581.deltacommit.inflight
+20250409102124581.deltacommit.requested
+20250409102124581_20250409102125667.deltacommit
+20250409102124612.compaction.inflight
+20250409102124612.compaction.requested
+20250409102124612_20250409102124892.commit
+20250409102127348.deltacommit.inflight
+20250409102127348.deltacommit.requested
+20250409102127348_20250409102128481.deltacommit
+20250409102127500.compaction.inflight
+20250409102127500.compaction.requested
+20250409102127500_20250409102127721.commit
+```
+
+This works, but has a few problems:
+
+1. No visibility into concurrency
+    - Multiple actions (e.g., `deltacommit` and `compaction`) often run 
concurrently.
+    - The CLI doesn't help correlate or visualize overlapping operations.
+2. Lack of temporal context
+    - Timestamps are embedded in filenames but are hard to compare visually - 
year, month and day can be quickly
+      determined, but minutes and seconds are harder to parse.
+    - No easy way to tell how long an action took or whether it's stalling 
unless you manually calculate the difference
+      between requested and completion time.
+3. Hard to spot inconsistencies or missing states
+    - An `inflight` compaction without a corresponding `commit` can indicate a 
starved/stuck compaction, which usually
+      blocks archiving/cleaning.
+    - These gaps are easy to miss when scanning filenames.
+
+On top of that, all timeline files are now stored as Avro binaries. Inspecting 
their contents requires custom Avro
+readers to convert the binaries to JSON.
+
+## Scope
+
+This RFC covers visualization of metadata available in Hudi tables. All 
features are **READ-ONLY** - there is no support
+for starting or spawning jobs that mutate a Hudi table.
+
+Alongside the timeline, the UI surfaces two additional read-only metadata 
views: the table's configuration
+(`hoodie.properties`) and its schema-change history.
+
+The following are **out of scope**:
+
+- **Archived timeline:** Only the active timeline is rendered. Loading 
instants from LSM-based archive files is left for
+  future work.
+- **Metadata table overlay:** The metadata table's own timeline is not shown 
alongside the main table timeline.
+- **Write/mutation operations:** The UI cannot trigger compactions, 
clustering, or any write action.
+- **Authentication/authorization:** No access control is added. The timeline 
server is assumed to run in a trusted
+  network, same as today.
+
+  **Threat model:** The UI does not widen the timeline server's exposure 
surface. The `/v2/` endpoints read the same
+  active-timeline and filesystem metadata that the existing `/v1/` REST APIs 
already serve, on the same network
+  interface (the server binds to all interfaces on the driver/standalone 
host). The UI is also opt-in and off by default
+  (`--enable-ui`). Operators on untrusted networks should front the server 
with a reverse proxy or restrict it to a
+  private interface / localhost via network policy.
+
+## Implementation
+
+Keeping the implementation lightweight is a priority - we should add as few 
dependencies as possible. Changes go into
+the existing `hudi-timeline-service` module, which contains a Javalin 
web-application that caches filesystem metadata of
+a Hudi table for job executors during tagging/writing.
+
+The first cut runs the UI on the Timeline Server in **STANDALONE** mode (see 
[Configuration](#configuration)) and is
+self-contained within `hudi-timeline-service`. Enabling the UI on the 
**EMBEDDED** timeline server inside a Spark
+driver, together with a Spark UI tab, requires cross-module wiring 
(`hudi-client-common`, `hudi-spark-client`); it is
+designed below but deferred to a follow-up to keep the initial PR small and 
focused. The standalone UI lands first; the
+embedded/Spark linking lands next.
+
+The Hudi Timeline UI has two parts: the frontend and backend.
+
+### Architecture
+
+The timeline server can run standalone or embedded inside a Spark driver. In 
embedded mode, a tab in the Spark UI links
+directly to the Hudi Timeline UI. The embedded mode and Spark UI tab (right 
side of the diagram below) are a planned
+follow-up; the first cut is standalone-only.
+
+```mermaid
+graph LR
+    Browser["Browser"]
+
+    subgraph Driver["Standalone / Spark Driver"]
+        subgraph TimelineServer["Javalin (Timeline Server)"]
+            Static["/ui + assets at root\n(HTML, JS, CSS)"]
+            API["/v2/hoodie/view/* - TimelineHandler"]
+            FSVM["FileSystemViewManager"]
+            Meta["HoodieTimeline / MetaClient"]
+
+            API --> FSVM --> Meta
+        end
+
+        subgraph SparkUI["Spark UI (:4040) - embedded mode (follow-up)"]
+            direction TB
+            SparkUIPad[ ] ~~~ Tabs["[Jobs] [Stages] ... [Hudi Timeline]"]
+        end
+
+        style SparkUIPad fill:none,stroke:none,color:none
+
+        Tabs -- "link" --> Static
+    end
+
+    Browser -- "HTTP" --> Static
+    Browser -- "HTTP" --> API
+    Browser -. "HTTP\n(embedded mode)" .-> SparkUI
+```
+
+There are two categories of requests:
+
+1. **Static file requests** - Javalin serves HTML, JavaScript, and CSS files 
from the classpath
+   (`src/main/resources/public/`) at the server root; `UiHandler` serves 
`index.html` at `/ui`. No server-side
+   rendering or template engine is needed.
+2. **REST API requests** (`/v2/hoodie/view/*`) - `TimelineHandler` processes 
these requests, reading timeline data from
+   the `FileSystemViewManager` (and a per-basepath `HoodieTableMetaClient` for 
table config/schema), returning JSON.
+
+### Frontend
+
+The frontend is static HTML pages with vanilla JavaScript, similar to the 
Spark Web UI. Javalin's built-in static file
+serving handles files from the classpath - no template engine (e.g., 
Thymeleaf) is needed and no new Java compile-time
+dependencies are added.
+
+No frontend build pipeline (npm, webpack, vite) is needed. Contributing to the 
UI requires only a text editor. The only
+external library is vis-timeline for timeline rendering.
+
+#### File Structure
+
+```
+hudi-timeline-service/src/main/resources/public/
+├── index.html                     # Landing page with basepath input form
+├── js/
+│   └── timeline.js                # vis-timeline initialization and REST API 
calls
+├── css/
+│   └── style.css                  # Basic styling
+└── lib/
+    └── vis-timeline/              # Bundled vis-timeline assets
+        ├── vis-timeline-graph2d.min.js
+        └── vis-timeline-graph2d.min.css
+```
+
+#### JavaScript Delivery: Bundled, No External Calls
+
+The vis-timeline library is served from the bundled copy at 
`/lib/vis-timeline/`. The UI makes no external network
+calls, so it works out of the box in air-gapped and security-conscious 
deployments with no extra configuration. The
+bundled assets add ~300KB to the JAR.
+
+Pinning a vendored copy (rather than loading from a CDN) keeps the UI 
deterministic and avoids a runtime dependency on
+an external host being reachable. If automatic patch updates are wanted later, 
a CDN source can be added as an opt-in
+config flag without changing this default.
+
+#### vis-timeline Configuration
+
+The timeline is configured with groups and items that map to Hudi's timeline 
model:
+
+- **Groups:** One row per action type - `commit`, `deltacommit`, `compaction`, 
`clean`, `rollback`, `clustering`,
+  `savepoint`, `logcompaction`, `indexing`, `restore`, `replacecommit`. These 
correspond to the actions in
+  `HoodieTimeline.VALID_ACTIONS_IN_TIMELINE`.
+- **Items:** Completed instants are rendered as range bars spanning from 
`requestedTime` to `completionTime`.
+  Non-completed instants (requested or inflight) are rendered as point items 
at `requestedTime`.
+- **Color coding:** Items are colored by state:
+    - Green -> `COMPLETED`
+    - Yellow -> `INFLIGHT`
+    - Red -> `REQUESTED`
+- **Tooltip:** On hover, shows the action type, requested time, completion 
time, and duration.
+- **Click handler:** Clicking an instant fetches its detail via 
`/v2/hoodie/view/timeline/instant` and shows the
+  deserialized JSON in a detail panel below the timeline.
+
+### Backend
+
+A `hudi-timeline-service` instance already serves filesystem metadata for 
multiple table basePaths since the
+`FileSystemView`s are cached in a map keyed by basepath.
+
+We extend this module with `/v2/` APIs to serve the timeline metadata needed 
by the UI.
+
+#### API Specification
+
+| Method | Path                                    | Parameters                                                            | Response        | Description                                                                                  |
+|--------|-----------------------------------------|-----------------------------------------------------------------------|-----------------|----------------------------------------------------------------------------------------------|
+| GET    | `/v2/hoodie/view/timeline/instants/all` | `basepath` (required)                                                 | `TimelineDTOV2` | All active instants (each with requested time, completion time, action, state), wrapped in a timeline DTO |
+| GET    | `/v2/hoodie/view/timeline/instant`      | `basepath`, `instant`, `instantaction`, `instantstate` (all required) | JSON string     | Deserialized content of a specific instant's metadata (Avro -> JSON)                         |
+| GET    | `/v2/hoodie/view/table/config`          | `basepath` (required)                                                 | JSON object     | The table's `hoodie.properties` (sorted)                                                     |
+| GET    | `/v2/hoodie/view/table/schema/history`  | `basepath` (required), `limit` (optional, default 200, max 1000)      | JSON object     | Current table schema plus schema-change history from recent commits                          |
+
+Static files (HTML, JS, CSS) are served from the classpath under 
`src/main/resources/public/` at the server root (e.g.,
+`/js/timeline.js`, `/lib/...`). `UiHandler` additionally registers `GET /ui`, 
which returns `index.html` to give the UI

Review Comment:
   🤖 The `/v2/hoodie/view/table/config` endpoint returns the full 
`hoodie.properties` content. These can include sensitive material — encryption 
key references, lock-provider connection strings, KMS endpoints, vault paths, 
etc. The threat-model note argues parity with `/v1/`, but `/v1/` does not 
expose `hoodie.properties`. Could you either filter known-sensitive keys 
(allowlist or denylist) or explicitly call out this expanded exposure in the 
threat model so operators understand what's on the wire?
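   
   If filtering is the preferred route, a denylist could look roughly like the 
   sketch below. This is only an illustration under stated assumptions: 
   `TableConfigRedactor`, the marker list, the mask value, and the 
   `example.lock.provider.password` key are all hypothetical, and the real set 
   of sensitive keys would need to be decided by the maintainers.
   
   ```java
   import java.util.List;
   import java.util.Locale;
   import java.util.Map;
   import java.util.Properties;
   import java.util.TreeMap;
   
   public final class TableConfigRedactor {
       // Hypothetical denylist: substrings that mark a key as sensitive.
       private static final List<String> SENSITIVE_MARKERS =
           List.of("password", "secret", "token", "credential");
   
       static final String MASK = "******";
   
       // Copies the properties into a key-sorted map, masking the value of
       // every key that contains one of the markers (case-insensitive).
       static Map<String, String> redact(Properties props) {
           Map<String, String> out = new TreeMap<>();
           for (String key : props.stringPropertyNames()) {
               String lower = key.toLowerCase(Locale.ROOT);
               boolean sensitive = SENSITIVE_MARKERS.stream().anyMatch(lower::contains);
               out.put(key, sensitive ? MASK : props.getProperty(key));
           }
           return out;
       }
   
       public static void main(String[] args) {
           Properties props = new Properties();
           props.setProperty("hoodie.table.name", "trips");
           // Hypothetical key, used only to exercise the mask.
           props.setProperty("example.lock.provider.password", "hunter2");
   
           redact(props).forEach((k, v) -> System.out.println(k + "=" + v));
       }
   }
   ```
   
   A denylist keeps the endpoint useful for debugging but can miss keys nobody 
   anticipated; an allowlist is stricter at the cost of hiding new properties 
   until they are added explicitly.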
   
   <sub><i>- AI-generated; verify before applying. React 👍/👎 to flag 
quality.</i></sub>



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

