voonhous commented on code in PR #13152:
URL: https://github.com/apache/hudi/pull/13152#discussion_r3401323530


##########
rfc/rfc-94/rfc-94.md:
##########
@@ -0,0 +1,515 @@
+<!--
+  Licensed to the Apache Software Foundation (ASF) under one or more
+  contributor license agreements.  See the NOTICE file distributed with
+  this work for additional information regarding copyright ownership.
+  The ASF licenses this file to You under the Apache License, Version 2.0
+  (the "License"); you may not use this file except in compliance with
+  the License.  You may obtain a copy of the License at
+
+       http://www.apache.org/licenses/LICENSE-2.0
+
+  Unless required by applicable law or agreed to in writing, software
+  distributed under the License is distributed on an "AS IS" BASIS,
+  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+  See the License for the specific language governing permissions and
+  limitations under the License.
+-->
+
+# RFC-94: Hudi Timeline User Interface (UI)
+
+## Proposers
+
+- @voonhous
+
+## Approvers
+
+- @danny0405
+- @rahil-c
+- @yihua
+
+## Status
+
+JIRA: [HUDI-9315](https://issues.apache.org/jira/browse/HUDI-9315)
+
+## Abstract
+
+Hudi Timeline metadata is stored as timestamped files representing state transitions of actions like
+`commit`, `deltacommit` and `compaction`. These files are accessible via the CLI or a file explorer,
+but it's hard to visualize concurrent actions, spot missing transitions, or tell how long each step
+took. Debugging timeline issues by reading filenames is tedious.
+
+This RFC proposes a UI-based timeline visualization tool that parses these metadata files, groups
+related actions, and renders them in a time-ordered, interactive view. Users can track the lifecycle
+of each operation, see concurrency patterns, and spot anomalies or long-running tasks. The
+implementation extends `hudi-timeline-service` with new `/v2/` REST APIs and a static HTML +
+JavaScript frontend powered by [vis-timeline](https://github.com/visjs/vis-timeline), served via
+Javalin's built-in static file serving with zero new Java compile-time dependencies.
+
+## Background
+
+Today, we rely on the CLI or direct filesystem inspection to understand timeline state through
+metadata files. These files represent different actions (e.g., `deltacommit`, `compaction`) and
+their lifecycle states (`requested`, `inflight`, `completed`), encoded in file names like:
+
+```shell
+20250409102118815.deltacommit.inflight
+20250409102118815.deltacommit.requested
+20250409102118815_20250409102124339.deltacommit
+20250409102121593.compaction.inflight
+20250409102121593.compaction.requested
+20250409102121593_20250409102122232.commit
+20250409102124581.deltacommit.inflight
+20250409102124581.deltacommit.requested
+20250409102124581_20250409102125667.deltacommit
+20250409102124612.compaction.inflight
+20250409102124612.compaction.requested
+20250409102124612_20250409102124892.commit
+20250409102127348.deltacommit.inflight
+20250409102127348.deltacommit.requested
+20250409102127348_20250409102128481.deltacommit
+20250409102127500.compaction.inflight
+20250409102127500.compaction.requested
+20250409102127500_20250409102127721.commit
+```
+
+This works, but has a few problems:
+
+1. No visibility into concurrency
+    - Multiple actions (e.g., `deltacommit` and `compaction`) often run concurrently.
+    - The CLI doesn't help correlate or visualize overlapping operations.
+2. Lack of temporal context
+    - Timestamps are embedded in filenames but are hard to compare visually - year, month and
+      day can be quickly determined, but minutes and seconds are harder to parse.
+    - No easy way to tell how long an action took or whether it's stalling unless you
+      manually calculate the difference between requested and completion time.
+3. Hard to spot inconsistencies or missing states
+    - An `inflight` compaction without a corresponding `commit` can indicate a starved/stuck
+      compaction, which usually blocks archiving/cleaning.
+    - These gaps are easy to miss when scanning filenames.
+
+On top of that, all timeline files are now stored as Avro binaries. Inspecting their contents
+requires custom Avro readers to convert the binaries to JSON.
+
+## Scope
+
+This RFC covers visualization of metadata available in Hudi tables. All features are **READ-ONLY** -
+there is no support for starting or spawning jobs that mutate a Hudi table.
+
+The following are **out of scope**:
+
+- **Archived timeline:** Only the active timeline is rendered. Loading instants from LSM-based
+  archive files is left for future work.
+- **Metadata table overlay:** The metadata table's own timeline is not shown alongside the main
+  table timeline.
+- **Write/mutation operations:** The UI cannot trigger compactions, clustering, or any write action.
+- **Authentication/authorization:** No access control is added. The timeline server is assumed to
+  run in a trusted network, same as today.
+
+## Implementation
+
+Keeping the implementation lightweight is a priority - we should add as few dependencies as
+possible. Changes go into the existing `hudi-timeline-service` module, which contains a Javalin
+web-application that caches filesystem metadata of a Hudi table for job executors during
+tagging/writing.
+
+To use the Hudi Timeline UI, users can either start the Timeline Server in **STANDALONE** mode
+(which is already supported) or enable the UI on the **EMBEDDED** timeline server that runs within
+a Spark application's driver process (see [Configuration](#configuration)).
+
+The Hudi Timeline UI has two parts: the frontend and backend.
+
+### Architecture
+
+The timeline server can run standalone or embedded inside a Spark driver. In embedded mode, a tab
+in the Spark UI links directly to the Hudi Timeline UI.
+
+```mermaid
+graph LR
+    Browser["Browser"]
+
+    subgraph Driver["Standalone / Spark Driver"]
+        subgraph TimelineServer["Javalin (Timeline Server)"]
+            Static["/ui/* - Static Files\n(HTML, JS, CSS)"]
+            API["/v2/timeline/* - UiHandler"]
+            FSVM["FileSystemViewManager"]
+            Meta["HoodieTimeline / MetaClient"]
+
+            API --> FSVM --> Meta
+        end
+
+        subgraph SparkUI["Spark UI (:4040) - embedded mode only"]
+            direction TB
+            SparkUIPad[ ] ~~~ Tabs["[Jobs] [Stages] ... [Hudi Timeline]"]
+        end
+
+        style SparkUIPad fill:none,stroke:none,color:none
+
+        Tabs -- "link" --> Static
+    end
+
+    Browser -- "HTTP" --> Static
+    Browser -- "HTTP" --> API
+    Browser -. "HTTP\n(embedded mode)" .-> SparkUI
+```
+
+There are two categories of requests:
+
+1. **Static file requests** (`/ui/*`) - Javalin serves HTML, JavaScript, and CSS files from the
+   classpath (`src/main/resources/public/`). No server-side rendering or template engine is needed.
+2. **REST API requests** (`/v2/timeline/*`) - A new `UiHandler` processes these requests, reading
+   timeline data from the `FileSystemViewManager` and `HoodieTableMetaClient`, then returning JSON
+   responses.
+
+### Frontend
+
+The frontend is static HTML pages with vanilla JavaScript, similar to the Spark Web UI. Javalin's
+built-in static file serving handles files from the classpath - no template engine (e.g.,
+Thymeleaf) is needed and no new Java compile-time dependencies are added.
+
+No frontend build pipeline (npm, webpack, vite) is needed. Contributing to the UI requires only a
+text editor. The only external library is vis-timeline for timeline rendering.
+
+#### File Structure
+
+```
+hudi-timeline-service/src/main/resources/public/
+├── index.html                     # Landing page with basepath input form
+├── js/
+│   └── timeline.js                # vis-timeline initialization and REST API calls
+├── css/
+│   └── style.css                  # Basic styling
+└── lib/
+    └── vis-timeline/              # Bundled fallback copy of vis-timeline
+        ├── vis-timeline-graph2d.min.js
+        └── vis-timeline-graph2d.min.css
+```
+
+#### JavaScript Delivery: CDN with Bundled Fallback
+
+The vis-timeline library is loaded with a two-tier strategy:
+
+1. **Primary:** Load from the `unpkg.com` CDN for automatic patch updates.
+2. **Fallback:** If the CDN is unreachable (e.g., air-gapped environments), load from the bundled
+   copy at `/lib/vis-timeline/`.
+
+The bundled fallback adds ~300KB to the JAR but ensures the UI works without internet access.
+
+#### vis-timeline Configuration
+
+The timeline is configured with groups and items that map to Hudi's timeline model:
+
+- **Groups:** One row per action type - `commit`, `deltacommit`, `compaction`, `clean`, `rollback`,
+  `clustering`, `savepoint`, `logcompaction`, `indexing`, `restore`, `replacecommit`. These
+  correspond to the actions in `HoodieTimeline.VALID_ACTIONS_IN_TIMELINE`.
+- **Items:** Completed instants are rendered as range bars spanning from `requestedTime` to
+  `completionTime`. Non-completed instants (requested or inflight) are rendered as point items at
+  `requestedTime`.
+- **Color coding:** Items are colored by state:
+    - Green -> `COMPLETED`
+    - Yellow -> `INFLIGHT`
+    - Red -> `REQUESTED`
+- **Tooltip:** On hover, shows the action type, requested time, completion time, and duration.
+- **Click handler:** Clicking an instant fetches its detail via `/v2/timeline/instant/details` and
+  shows the deserialized JSON in a detail panel below the timeline.
+
+### Backend
+
+A `hudi-timeline-service` instance already serves filesystem metadata for multiple table basePaths
+since the `FileSystemView`s are cached in a map keyed by basepath.
+
+We extend this module with `/v2/` APIs to serve the timeline metadata needed by the UI.
+
+#### API Specification
+
+| Method | Path                           | Parameters                                                           | Response                | Description                                                                         |
+|--------|--------------------------------|----------------------------------------------------------------------|-------------------------|-------------------------------------------------------------------------------------|
+| GET    | `/v2/timeline/instants`        | `basepath` (required)                                                | `List<InstantDTO>` (v2) | Returns all active instants with requested time, completion time, action, and state |
+| GET    | `/v2/timeline/instant/details` | `basepath` (required), `instantTime` (required), `action` (required) | JSON string             | Returns the deserialized content of a specific instant's metadata (Avro -> JSON)    |
+

Review Comment:
   Good point. Scoping note: this endpoint serves only the active timeline (the
   unbounded archived timeline is out of scope). Because archiving bounds the
   active timeline, instant counts are usually modest. The first cut returns all
   active instants and relies on client-side zoom/scroll and filtering for
   navigation. I added a note to the API spec documenting that intent, and noting
   that the endpoint can be extended additively with optional `from`/`to`
   time-range params (and/or a `limit`) if active-timeline sizes ever become a
   concern.
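
For illustration only, here is a minimal sketch of how the optional `from`/`to`/`limit` parameters could behave additively. It is written in Python for brevity and is not the actual handler code: `filter_instants`, the dict-based record shape, and the sample data are hypothetical. Only the `requestedTime`/`completionTime` field names and the sample timestamps come from the RFC text. It assumes instant times are fixed-width digit strings, so plain string comparison orders them chronologically, and that `limit` keeps the most recent instants (the comment does not specify this).

```python
from typing import Optional

# Hypothetical instant records. The field names mirror the RFC's
# `requestedTime` / `completionTime`; the exact InstantDTO shape is an assumption.
INSTANTS = [
    {"requestedTime": "20250409102118815", "completionTime": "20250409102124339",
     "action": "deltacommit", "state": "COMPLETED"},
    {"requestedTime": "20250409102121593", "completionTime": "20250409102122232",
     "action": "commit", "state": "COMPLETED"},
    {"requestedTime": "20250409102124581", "completionTime": None,
     "action": "deltacommit", "state": "INFLIGHT"},
    {"requestedTime": "20250409102127348", "completionTime": None,
     "action": "deltacommit", "state": "REQUESTED"},
]


def filter_instants(instants, from_time: Optional[str] = None,
                    to_time: Optional[str] = None,
                    limit: Optional[int] = None):
    """Return instants whose requestedTime falls in [from_time, to_time].

    All parameters are optional, so a call without them returns every
    instant, matching the first-cut behaviour described above.
    """
    # Fixed-width digit strings compare chronologically as plain strings.
    selected = [
        i for i in instants
        if (from_time is None or i["requestedTime"] >= from_time)
        and (to_time is None or i["requestedTime"] <= to_time)
    ]
    selected.sort(key=lambda i: i["requestedTime"])
    if limit is not None:
        # Assumption: `limit` keeps the most recent instants.
        selected = selected[-limit:] if limit > 0 else []
    return selected


if __name__ == "__main__":
    window = filter_instants(INSTANTS,
                             from_time="20250409102121000",
                             to_time="20250409102125000")
    for instant in window:
        print(instant["requestedTime"], instant["action"], instant["state"])
```

Because every parameter defaults to "no constraint", existing clients that pass only `basepath` would keep getting the full active timeline.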
   


