[PR] feat(huggingFace): add HuggingFaceModelResource for model browsing and media proxy [texera]

via GitHub Sun, 17 May 2026 12:41:28 -0700


PG1204 opened a new pull request, #5123:
URL: https://github.com/apache/texera/pull/5123


   ### Summary
   Related to issue #5041 
   
   First PR in a stacked series landing the HuggingFace operator end-to-end. 
This PR adds **only** the backend REST resource — no operator code yet. The 
resource is independently useful (the frontend can already integrate with it) 
and lets reviewers absorb the API surface before the operator class lands.
   
   ### What changes were proposed in this PR?
   
   Introduces `HuggingFaceModelResource`, a Jersey resource registered at 
`/api/huggingface/*`:
   
   | Endpoint | Purpose |
   |---|---|
   | `GET /api/huggingface/models` | Browse or search models per task. Browse mode is served from an in-process cache; search requests are forwarded to the HF Hub. |
   | `GET /api/huggingface/tasks` | Fetch HF pipeline tags, filtered to tasks with hosted inference. Cached for the lifetime of the process. |
   | `POST /api/huggingface/upload-audio` | Upload audio bytes for HF audio tasks; stores the file in `/tmp/texera-hf-audio/` and returns the file path. |
   | `GET /api/huggingface/audio-preview` | Stream previously uploaded audio (the path is validated to prevent directory traversal). |
   | `GET /api/huggingface/media-proxy` | Proxy remote media URLs (HF inference responses) to bypass browser CORS restrictions. |
   
   Also adds a one-line registration of the resource in `TexeraWebApplication.scala`.
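
For reviewers who want a concrete picture of the traversal check mentioned for `audio-preview`, here is a minimal sketch of one common approach: normalize the requested path and require that it stays under the upload directory. This is an illustration only, not the code in this PR. The class name `AudioPathValidator` and the method `isInsideUploadDir` are hypothetical; only the directory `/tmp/texera-hf-audio/` comes from the table above. The sketch assumes a Unix-style filesystem and does not resolve symlinks.

```java
import java.nio.file.InvalidPathException;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper, not part of this PR: illustrates a normalize-and-prefix check.
public final class AudioPathValidator {
    // Upload directory named in the endpoint table.
    public static final Path UPLOAD_DIR =
        Paths.get("/tmp/texera-hf-audio").toAbsolutePath().normalize();

    // Returns true only if the requested path, after normalization,
    // points to something strictly inside UPLOAD_DIR.
    public static boolean isInsideUploadDir(String requested) {
        if (requested == null || requested.isEmpty()) {
            return false;
        }
        try {
            // resolve() keeps absolute inputs as-is and anchors relative ones
            // at UPLOAD_DIR; normalize() collapses "." and ".." segments.
            Path candidate = UPLOAD_DIR.resolve(requested).normalize();
            // Path.startsWith compares whole path components, so a sibling
            // such as /tmp/texera-hf-audio-evil does not match.
            return candidate.startsWith(UPLOAD_DIR) && !candidate.equals(UPLOAD_DIR);
        } catch (InvalidPathException e) {
            // Malformed input (for example, an embedded NUL character).
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(isInsideUploadDir("/tmp/texera-hf-audio/clip.wav"));
        System.out.println(isInsideUploadDir("/tmp/texera-hf-audio/../../etc/passwd"));
        System.out.println(isInsideUploadDir("/tmp/texera-hf-audio-evil/clip.wav"));
    }
}
```

A stricter variant could also call `Path.toRealPath()` on the candidate so that symlinks inside the upload directory cannot point outside it. That requires the file to exist and is left out of this sketch.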
   
   ### Stacked PR plan
   
   This is **PR 1 of ~9**. Subsequent PRs will add:
   
   - PR 2: refactored `HuggingFaceInferenceOpDesc` skeleton + text-generation 
codegen
   - PRs 3–5: per-task-family codegen (image, audio + media-gen, QA/ranking)
   - PRs 6–8: frontend (task/model selector, property-editor visibility, 
result-panel media)
   - PR 9: developer docs
   
   ### Test plan
   
   - [x] `sbt "amber/test"` passes locally
   - [ ] Hit `GET /api/huggingface/tasks` and confirm JSON list of supported 
tasks
   - [ ] Hit `GET /api/huggingface/models?task=text-generation` and confirm 
paginated model list
   - [ ] `POST /api/huggingface/upload-audio` with a small WAV, then fetch via 
`/api/huggingface/audio-preview` and confirm the bytes match
   - [ ] `GET /api/huggingface/media-proxy?url=…` with a known HF inference 
response URL and confirm the response is streamed
   
   ### Authored / co-authored using generative AI tooling?
   
   Co-authored with Claude Opus 4.7


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
