[PR] [Hackathon] feat: Workflow performance profiler + agent-driven optimization [texera]

via GitHub Sat, 16 May 2026 00:25:38 -0700


PG1204 opened a new pull request, #5098:
URL: https://github.com/apache/texera/pull/5098


   ### Demo Video
   
https://drive.google.com/file/d/1rRaCWynJkJE6WtomWQiceh9KCH0Qc48M/view?usp=drive_link
   
   ### What changes were proposed in this PR?
   
   Today, a user who runs a workflow sees only a few numeric stat badges on the 
canvas. They have no visual signal for *which* operator is slow, *why* it's 
slow, or what to do about it. This PR closes that loop end-to-end and then lets 
the AI agent participate in it.
   
   #### Before / after
   
   | User task | Before | After |
   |---|---|---|
   | Spot a slow operator | Read raw stat badges | Canvas heatmap colors 
operators green → red by relative cost |
   | Understand *why* it's slow | No guidance | 6 rule-based hints in the side 
panel + canvas ghost suggestions |
   | Compare two runs | Not possible | Upload a downloaded JSON report **or** 
pick a past execution from the popover |
   | Apply a fix | Manual property editing | Click **Apply** on a canvas ghost 
or an agent proposal card |
   | Ask the agent about performance | Agent only sees workflow shape | Agent 
has 5 read-only profiler tools and a structured-proposal channel |
   | Get a smart Filter / worker default | Static defaults (first column / 4 
workers) | Agent reasons over schema + runtime to suggest informed values; 
rule-based fallback when offline |
   
   #### The story
   
   Turn on the profiler (gauge icon in the run bar). The canvas paints itself: 
the Python UDF that takes most of the wall-clock time turns red, while 
everything else stays green. Hover over the red operator and a tooltip shows 
its runtime, throughput, and idle ratio. The property panel adds a "Profiler" 
section listing the fired hints (`RUNTIME_OUTLIER`, `LOW_PARALLELISM_HOT_OP`, …) 
with plain-English messages.
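
For readers who want a concrete picture of the green → red mapping, here is a minimal sketch. It assumes "relative cost" means an operator's runtime divided by the slowest operator's runtime, and that the color is a linear hue interpolation in HSL. The `OperatorRuntime` type and `heatmapColors` function are hypothetical names for illustration, not the PR's actual code.

```typescript
// Hypothetical shape: the PR does not show the real per-operator stats type.
interface OperatorRuntime {
  operatorId: string;
  runtimeMs: number;
}

// Map each operator to an HSL color: hue 120 (green) for negligible relative
// cost, hue 0 (red) for the most expensive operator in the run.
function heatmapColors(stats: OperatorRuntime[]): Map<string, string> {
  const maxRuntime = Math.max(0, ...stats.map(s => s.runtimeMs));
  const colors = new Map<string, string>();
  for (const s of stats) {
    // Relative cost in [0, 1]; a run where every runtime is 0 stays green.
    const relative = maxRuntime > 0 ? s.runtimeMs / maxRuntime : 0;
    const hue = Math.round(120 * (1 - relative));
    colors.set(s.operatorId, `hsl(${hue}, 80%, 45%)`);
  }
  return colors;
}
```

Normalizing against the slowest operator (an assumption here) means at least one operator is always fully red, which matches the described behavior of the UDF standing out while the rest stay green.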
   
   Hints that map to mechanical fixes also appear as **ghost suggestions** on 
the canvas: a "Bump workers" tag floats next to hot single-worker operators, 
and an "Insert Filter" ghost sits on edges where the rule engine sees an 
over-producing upstream. Click Apply and the change lands, with a "Run now" 
prompt so you can verify.
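
As a rough illustration of how a rule engine of this kind might fire the two hints named above, the sketch below flags an operator that dominates total runtime and, if it also runs on a single worker, adds the low-parallelism hint. The 50% threshold, the `OperatorProfile` and `Hint` shapes, and the `evaluateHints` function are assumptions made for this example; only the two hint codes come from the PR.

```typescript
type HintCode = "RUNTIME_OUTLIER" | "LOW_PARALLELISM_HOT_OP";

// Hypothetical input and output shapes for illustration.
interface OperatorProfile {
  operatorId: string;
  runtimeMs: number;
  workerCount: number;
}

interface Hint {
  operatorId: string;
  code: HintCode;
  message: string;
}

// Assumed threshold: an operator is "hot" if it takes at least half of the
// summed operator runtime. The PR does not state its real thresholds.
const OUTLIER_SHARE = 0.5;

function evaluateHints(profiles: OperatorProfile[]): Hint[] {
  const total = profiles.reduce((sum, p) => sum + p.runtimeMs, 0);
  if (total === 0) return [];
  const hints: Hint[] = [];
  for (const p of profiles) {
    const share = p.runtimeMs / total;
    if (share < OUTLIER_SHARE) continue;
    hints.push({
      operatorId: p.operatorId,
      code: "RUNTIME_OUTLIER",
      message: `${p.operatorId} accounts for ${Math.round(share * 100)}% of total runtime.`,
    });
    // A hot operator on one worker is a candidate for a "Bump workers" ghost.
    if (p.workerCount === 1) {
      hints.push({
        operatorId: p.operatorId,
        code: "LOW_PARALLELISM_HOT_OP",
        message: `${p.operatorId} is hot but runs on a single worker.`,
      });
    }
  }
  return hints;
}
```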
   
   Want to compare runs? Download a profiler report and re-upload it later, 
**or** open the popover dropdown and pick directly from past executions — the 
existing delta heatmap and side-panel UI render from either source.
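
The comparison itself can be sketched as a per-operator runtime difference between two reports. The `ProfilerReport` shape below is hypothetical (the PR does not show the JSON schema), as are the function names; the point is only that an uploaded file and a stored past execution can both be reduced to the same structure before diffing.

```typescript
// Hypothetical report shape; the real JSON schema is not shown in the PR.
interface ProfilerReport {
  operators: Record<string, { runtimeMs: number }>;
}

// Either source (uploaded file text or a stored execution serialized as JSON)
// is parsed into the same structure.
function parseReport(json: string): ProfilerReport {
  return JSON.parse(json) as ProfilerReport;
}

// Per-operator runtime change (current minus baseline), in milliseconds.
// Operators present in only one report are skipped because they have no pair.
function runtimeDeltas(
  baseline: ProfilerReport,
  current: ProfilerReport
): Map<string, number> {
  const deltas = new Map<string, number>();
  for (const [id, stats] of Object.entries(current.operators)) {
    const before = baseline.operators[id];
    if (before !== undefined) {
      deltas.set(id, stats.runtimeMs - before.runtimeMs);
    }
  }
  return deltas;
}
```

A negative delta means the operator got faster relative to the baseline run.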
   
   Open the agent chat and ask *"is anything slow?"* The agent calls 
`getProfilerSummary` and `getOptimizationHints`, then surfaces a structured 
proposal that renders inline as an **Apply / Reject card**. The agent never 
mutates the workflow itself — the frontend's Apply button is the only mutation 
path. Multi-step optimizations come back as a numbered plan card with per-step 
Apply plus an "Apply All" button.
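
One way to picture the "agent proposes, frontend applies" split is to treat a proposal as plain data and keep the only mutating logic in a frontend-side function. The sketch below is an assumption-laden illustration: the `Proposal` union, the `Workflow` shape, and `applyProposal` are hypothetical names, and the real proposal format is not shown in the PR.

```typescript
// Hypothetical proposal payloads the agent could emit as structured data.
type Proposal =
  | { kind: "setWorkerCount"; operatorId: string; workerCount: number }
  | { kind: "insertFilter"; sourceId: string; targetId: string; filterId: string };

// Hypothetical, heavily simplified workflow model.
interface Workflow {
  operators: Record<string, { type: string; workerCount: number }>;
  links: Array<{ from: string; to: string }>;
}

// Runs only when the user clicks Apply. Returns a new workflow and leaves the
// input untouched, so a rejected or failed proposal has no side effects.
function applyProposal(wf: Workflow, p: Proposal): Workflow {
  switch (p.kind) {
    case "setWorkerCount": {
      const op = wf.operators[p.operatorId];
      if (!op) throw new Error(`Unknown operator: ${p.operatorId}`);
      return {
        ...wf,
        operators: { ...wf.operators, [p.operatorId]: { ...op, workerCount: p.workerCount } },
      };
    }
    case "insertFilter": {
      const i = wf.links.findIndex(l => l.from === p.sourceId && l.to === p.targetId);
      if (i < 0) throw new Error(`No link ${p.sourceId} -> ${p.targetId}`);
      // Replace the original edge with source -> filter -> target.
      const links = [
        ...wf.links.slice(0, i),
        { from: p.sourceId, to: p.filterId },
        { from: p.filterId, to: p.targetId },
        ...wf.links.slice(i + 1),
      ];
      const operators = { ...wf.operators, [p.filterId]: { type: "Filter", workerCount: 1 } };
      return { operators, links };
    }
  }
}

// An "Apply All" over a multi-step plan is then a left fold over its steps.
function applyAll(wf: Workflow, steps: Proposal[]): Workflow {
  return steps.reduce(applyProposal, wf);
}
```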
   
   The canvas ghosts themselves get smarter when the agent is available: 
clicking "Insert Filter" calls a `proposeFilterPredicate` endpoint that reads 
the upstream schema and downstream context to fill in real `{attribute, 
condition, value}` rows instead of the rule-based `is not null` placeholder. 
Similarly, "Bump workers" calls `proposeWorkerCount` to pick a number based on 
runtime and idle ratio. Both fall back to the static defaults on any miss, so 
the feature works with or without the agent running.
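
The fallback behavior described above can be sketched as "try the agent, and on any error or empty answer use the static default". In this illustration the `FilterRow` fields mirror the `{attribute, condition, value}` rows mentioned in the PR, while `PredicateProposer`, `staticDefaultRows`, and `filterRowsWithFallback` are hypothetical names; the proposer stands in for whatever client actually calls the `proposeFilterPredicate` endpoint.

```typescript
interface FilterRow {
  attribute: string;
  condition: string;
  value: string;
}

// Stand-in for the client that calls the agent endpoint (hypothetical type).
type PredicateProposer = (upstreamColumns: string[]) => Promise<FilterRow[]>;

// Static default described in the PR: first column with an `is not null`
// placeholder condition.
function staticDefaultRows(columns: string[]): FilterRow[] {
  return columns.length > 0
    ? [{ attribute: columns[0], condition: "is not null", value: "" }]
    : [];
}

// Ask the agent first; on a rejected call or an empty answer, fall back to
// the static default so the ghost still works without the agent running.
async function filterRowsWithFallback(
  columns: string[],
  propose: PredicateProposer
): Promise<FilterRow[]> {
  try {
    const rows = await propose(columns);
    return rows.length > 0 ? rows : staticDefaultRows(columns);
  } catch {
    return staticDefaultRows(columns);
  }
}
```

The same pattern would apply to `proposeWorkerCount`, with the static default of 4 workers as the fallback value.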
   
   On the backend, a new `ProfilerScoring.scala` helper mirrors the frontend's 
three scoring formulas so any future server-side use (persisted stats, 
scheduler decisions) stays consistent with the UI. It has no call sites yet 
and is purely future-use infrastructure.
   
   ### Any related issues, documentation, discussions?
   
   Related to the Apache Texera Agent Hackathon 
([#5059](https://github.com/apache/texera/discussions/5059#discussioncomment-16924043)).
   
   ### How was this PR tested?
   
   ```bash
   # Frontend
   cd frontend
   ./node_modules/.bin/tsc --noEmit -p tsconfig.json
   ./node_modules/.bin/ng test --watch=false \
     --include='**/profiler*.spec.ts' --include='**/agent-proposal*.spec.ts'
   ./node_modules/.bin/ng build
   
   # Agent-service (run from the repository root, not from frontend/)
   cd agent-service
   bun test
   bunx tsc --noEmit
   ```
   
   258/258 frontend Vitest tests pass across 12 spec files; 147/147 
agent-service Bun tests pass across 10 spec files; both `tsc --noEmit` runs are 
clean; `ng build` succeeds. The Scala spec for `ProfilerScoring` was not run 
locally: the amber sbt project hits a pre-existing `AddMetaInfLicenseFiles` not 
found plugin error unrelated to this PR, so CI is the canonical validator.
   
   Manual end-to-end: built a CSVScan → Filter → heavy-Python-UDF → Visualize 
workflow and confirmed the heatmap colors the UDF red; toggled all three views; uploaded 
a JSON report and confirmed delta heatmap; picked a past execution from the new 
dropdown and got the same result; asked the agent "is anything slow?" and 
confirmed the orange Apply/Reject card lands the change on the canvas; asked 
"what can we do to make this faster?" and confirmed the blue multi-step plan 
card renders with per-step Apply + Apply All; clicked Insert-Filter and 
Bump-Workers ghosts both with and without the agent running, confirming the 
fallback path.
   
   ### Was this PR authored or co-authored using generative AI tooling?
   
   Generated-by: Claude Code (Opus 4.7)


